Abstract

Influence diffusion and influence maximization in large-scale online social networks (OSNs) have been extensively
studied, because of their impact on enabling effective online viral marketing. Existing studies focus on social
networks with only friendship relations, whereas the foe or enemy relations that commonly exist in many OSNs, e.g.,
Epinions and Slashdot, are completely ignored. In this paper, we make the first attempt to investigate influence
diffusion and influence maximization in OSNs with both friend and foe relations, which are modeled using positive and
negative edges on signed networks. In particular, we extend the classic voter model to signed networks and analyze 
the dynamics of influence diffusion of two opposite opinions. We first provide systematic characterization of both 
short-term and long-term dynamics of influence diffusion in this model, and illustrate that the steady state behaviors of 
the dynamics depend on three types of graph structures, which we refer to as balanced graphs, anti-balanced graphs, 
and strictly unbalanced graphs. We then apply our results to solve the influence maximization problem and develop 
efficient algorithms to select initial seeds of one opinion that maximize either its short-term influence coverage or 
long-term steady state influence coverage. Extensive simulation results on both synthetic and real-world networks, 
such as Epinions and Slashdot, confirm our theoretical analysis on influence diffusion dynamics, and demonstrate the 
efficacy of our influence maximization algorithm over other heuristic algorithms. 
1 Introduction

As the popularity of online social networks (OSNs) such as Facebook and Twitter continuously increases, OSNs have 
become an important platform for the dissemination of news, ideas, opinions, etc. The openness of the OSN platforms 
and the richness of contents and user interaction information enable intelligent online recommendation systems and 
viral marketing techniques. For example, if a company wants to promote a new product, it may identify a set of 
influential users in the online social network and provide them with free sample products. The company hopes that these
influential users could influence their friends, and friends of friends in the network and so on, generating a large 
influence cascade so that many users adopt their product as a result of such word-of-mouth effect. The question is how 
to select the initial users given a limited budget on free samples, so as to influence the largest number of people to 
purchase the product through this "word-of-mouth" process. Similar situations could apply to the promotion of ideas 
and opinions, such as political candidates trying to find early supporters for their political proposals and agendas, 
government authorities or companies trying to win public support by finding and convincing an initial set of early 
adopters to their ideas. 

The above problem is referred to as the influence maximization problem in the literature, which has been extensively
studied in recent years [8, 10, 16, 18, 20, 21, 25, 29, 30]. In these studies, several influence diffusion models
are proposed to formulate the underlying influence propagation processes, including linear threshold (LT) model, in- 
dependent cascade (IC) model, voter model, etc. A number of approximation algorithms and scalable heuristics are 
designed under these models to solve the influence maximization problem. 
However, all existing studies only look at networks with positive (i.e., friend, altruism, or trust) relationships,
whereas in reality, relationships also include negative ones, such as foe, spite, or distrust relationships. On eBay, users
develop trust and distrust in agents in the network; in online review and news forums, such as Epinions and Slashdot,
readers approve or denounce reviews and articles of each other. Some recent studies [11, 23, 24] already look into
network structures with both positive and negative relationships. As commonly assumed in many existing
social influence studies [8-10, 16, 20], positive relationships carry the influence in a positive manner, i.e., you would
more likely trust and adopt your friends' opinions. In contrast, we consider that negative relationships often carry 
influence in a reverse direction — if your foe chooses one opinion or votes for one candidate, you would more likely 
be influenced to do the opposite. This echoes the principles that "the friend of my enemy is my enemy" and "the 
enemy of my enemy is my friend." Structural balance theory has been developed based on these assumptions in social 
science (see Chapter 5 of [14] and the references therein). We acknowledge that in real social networks, people's
reactions to the influence from their friends or foes could be complicated, i.e., one could take the opposite opinion of 
what her foe suggests for one situation or topic, but may adopt the suggestion from the same person for a different 
topic, because she trusts her foe's expertise in that particular topic. In this study, we consider the influence diffusion 
for a single topic, where one always takes the opposite opinion of what her foe suggests. This is our first attempt
to model influence diffusion in signed networks, and such a topic-dependent simplification is commonly employed in
prior influence diffusion studies on unsigned networks [8, 10, 16, 18, 20]. Our work aims at providing a mathematical
analysis of influence diffusion dynamics that incorporate negative relationships, and at applying our analysis to the
algorithmic problem of influence maximization.

1.1 Our contributions 

In this paper, we extend the classic voter model [13, 19] to incorporate negative relationships for modeling the diffusion
of opinions in a social network. Given an unsigned directed graph (digraph), the basic voter model works as follows. At 
each step, every node in the graph randomly picks one of its outgoing neighbors and adopts the opinion of this neighbor. 
Thus, the voter model is suitable to interpret and model opinion diffusions where people's opinions may switch back 
and forth based on their interactions with other people in the network. To incorporate negative relationships, we 
consider signed digraphs in which every directed edge is either positive or negative, and we consider the diffusion 
of two opposite opinions, e.g., black and white colors. We extend the voter model to signed digraphs, such that at 
each step, every node randomly picks one of its outgoing neighbors, and if the edge to this neighbor is positive, the 
node adopts the neighbor's opinion, but if the edge is negative, the node adopts the opposite of the neighbor's opinion 
(Section 2).

We provide a detailed mathematical analysis of the voter model dynamics for signed networks (Section 3). For short-term
dynamics, we derive the exact formula for the opinion distribution at each step. For long-term dynamics, we provide
closed-form formulas for the steady state distribution of opinions. We show that the steady state distribution depends 
on the graph structure: we divide signed digraphs into three classes of graph structures — balanced graphs, anti- 
balanced graphs, and strictly unbalanced graphs, each of which leads to a different type of steady state distributions of 
opinions. While balanced and unbalanced graphs have been extensively studied by structural balance theory in social
science [14], the anti-balanced graphs form a new class that has not been covered before, to the best of our knowledge.
Moreover, our long-term dynamics not only cover strongly connected and aperiodic digraphs that most of such studies 
focus on, but also weakly connected and disconnected digraphs, making our study more comprehensive. 

We then study the influence maximization problem under the voter model for signed digraphs (Section 4). The
problem here is to select at most k initial white nodes while all others are black, so that either in the short term or in the long term
the expected number of white nodes is maximized. This corresponds to the scenario where one opinion is dominating
the public and an alternative opinion (e.g., a competing political agenda, or a new innovation) tries to win over
as many supporters as possible by selecting some initial seeds to influence. We provide efficient algorithms that
find optimal solutions for both short-term and long-term cases. In particular, for long-term influence maximization, 
our algorithm provides a comprehensive solution covering weakly connected and disconnected signed digraphs, with 
nontrivial computations on influence coverage of seed nodes. 

Finally, we conduct extensive simulations on both real-world and synthetic networks to verify our analysis and to
show the effectiveness of our influence maximization algorithm (Section 5). The simulation results demonstrate that
our influence maximization algorithms perform much better than other heuristic algorithms.

To the best of our knowledge, we are the first to study influence diffusion and influence maximization in signed 
networks, and the first to apply the voter model to this case and provide efficient algorithms for influence maximization 
under voter model for signed networks. Our identification of the class of anti-balanced graphs could also be of 
independent interest in the study of social networks with both positive and negative relationships. 

1.2 Related work 

To the best of our knowledge, we are the first to study the influence diffusion and influence maximization on signed 
digraphs. In this subsection, we discuss the topics that are closely related to our problem, such as: (1) influence 
maximization and voter model, (2) signed networks, and (3) competitive influence diffusion. 

Influence maximization and voter model. Influence maximization has been extensively studied in the literature.
The initial work [20] proposes several influence diffusion models and provides the greedy approximation algorithm
for influence maximization. More recent works [8, 10, 16, 18, 21, 25, 29] study efficient optimizations and scalable
heuristics for the influence maximization problem. In particular, the voter model is proposed in [13, 19], and is well suited
for modeling opinion diffusions in which people may switch opinions back and forth from time to time due to
interactions with other people in the network. Even-Dar and Shapira [16] study the influence maximization problem
in the voter model on simple unsigned and undirected graphs, and they show that the best seeds for long-term influence
maximization are simply the highest degree nodes. In contrast, we show in this paper that seed selection for signed
digraphs is more sophisticated, especially for weakly connected or disconnected signed digraphs. Further voter model
research has been conducted in the physics domain, where the voter model, the zero-temperature Glauber dynamics for
the Ising model, the invasion process, and other related models of population dynamics belong to the class of models with
two absorbing states and epidemic spreading dynamics [1, 27, 32]. However, none of these works studies the influence
diffusion and influence maximization of the voter model on signed networks.

Signed networks. Signed networks with both positive and negative links have gained attention recently [3, 11, 22-24].
In [11, 23, 24], the authors empirically study the structure of real-world social networks with negative relationships
based on two social science theories, i.e., balance theory and status theory. Kunegis et al. [22] study the spectral
properties of signed undirected graphs, with applications in link prediction, spectral clustering, etc. Borgs et
al. [3] propose a generalized PageRank algorithm for signed networks with application to online recommendations.
None of the above works studies influence diffusion and influence maximization in signed networks.

Competitive influence diffusion. A number of recent studies focus on competitive influence diffusion and maximization
[2, 4, 6, 7], in which two or more competitive opinions or innovations are diffusing in the network. Although they
consider two or more competitive or opposing influence diffusions, they are all on unsigned networks, different from 
our study here on diffusion with both positive and negative relationships. 

2 Voter model for signed networks 

We consider a weighted directed graph (digraph) $G = (V, E, A)$, where $V$ is the set of vertices, $E$ is the set of directed
edges, and $A$ is the weighted adjacency matrix with $A_{ij} \neq 0$ if and only if $(i, j) \in E$, with $A_{ij}$ as the weight of
edge $(i, j)$. The voter model was first introduced for unsigned graphs, with nonnegative adjacency matrices $A$. In
this model, each node holds one of two opposite opinions, represented by black and white colors. Initially each node
has either black or white color. At each step $t \ge 1$, every node $i$ randomly picks one outgoing neighbor $j$ with
probability proportional to the weight of $(i, j)$, namely $A_{ij} / \sum_{\ell} A_{i\ell}$, and changes its color to $j$'s color. The voter
model also has a random walk interpretation. If a random walk starts from $i$ and stops at node $j$ at step $t$, then $i$'s color
at step $t$ is $j$'s color at step 0.

Table 1: Notations and terminologies

$G = (V, E, A)$, $\bar{G} = (V, E, \bar{A})$: $G$ is a signed digraph, with signed adjacency matrix $A$, and $\bar{G}$ is the unsigned version of $G$, with
adjacency matrix $\bar{A}$.

$A^+$, $A^-$: $A^+$ (resp. $A^-$) is the non-negative adjacency matrix representing positive (resp. negative) edges of
$G$, with $A = A^+ - A^-$ and $\bar{A} = A^+ + A^-$.

$\mathbf{1}$, $\pi$, $x_0$, $x_t$, $x$, $x_e$, $x_o$: Vector forms. All vectors are $|V|$-dimensional column vectors by default; $\mathbf{1}$ is the all-one vector, $\pi$ is
the stationary distribution of ergodic digraph $\bar{G}$; $x_0$ (resp. $x_t$) is the white color distribution at the
beginning (resp. at step $t$); $x$ is the steady state white color distribution; $x_e$ (resp. $x_o$) is the steady
state white color distribution for even (resp. odd) steps.

$d$, $d^+$, $d^-$, $D$: $d$, $d^+$, and $d^-$ are weighted out-degree vectors of $G$, where $d = \bar{A}\mathbf{1}$, $d^+ = A^+\mathbf{1}$, and $d^- = A^-\mathbf{1}$;
$D = \mathrm{diag}[d]$ is the diagonal degree matrix filled with entries of $d$.

$P$, $\bar{P}$: $P = D^{-1}A$ is the signed transition matrix of $G$ and $\bar{P} = D^{-1}\bar{A}$ is the transition probability matrix
of $\bar{G}$.

$v_Z$, $v_S$, $v_{Z,S_Z}$: Given a vector $v$ and a node set $Z \subseteq V$, $v_Z$ is the projection of $v$ on $Z$. Given a partition $S, \bar{S}$ of $V$, $v_S$
is signed such that $v_S(i) = v(i)$ if $i \in S$, and $v_S(i) = -v(i)$ if $i \notin S$. Given a partition $S_Z, \bar{S}_Z$ of
$Z$, $v_{Z,S_Z}$ is obtained by taking the projection of $v$ on $Z$ first, then negating the signs for entries in $\bar{S}_Z$.

$I$, $I_S$, $B_Z$: $I$ is the identity matrix. $I_S = \mathrm{diag}[\mathbf{1}_S]$ is the signed identity matrix. $B_Z$ is the projection of a
matrix $B$ to $Z \subseteq V$.

In this paper, we extend the voter model to signed digraphs, in which the adjacency matrix $A$ may contain negative
entries. A positive entry $A_{ij}$ represents that $i$ considers $j$ as a friend or $i$ trusts $j$, and a negative $A_{ij}$ means that $i$
considers $j$ as a foe or $i$ distrusts $j$. The absolute value $|A_{ij}|$ represents the strength of this trust or distrust relationship.
The voter model is thus extended naturally such that one always takes the same opinion as his/her friend, and the
opposite opinion of his/her foe. Technically, at each step $t \ge 1$, $i$ randomly picks one outgoing neighbor $j$ with
probability $|A_{ij}| / \sum_{\ell} |A_{i\ell}|$, and if $A_{ij} > 0$ (or edge $(i, j)$ is positive) then $i$ changes its color to $j$'s color, but if
$A_{ij} < 0$ (or edge $(i, j)$ is negative) then $i$ changes its color to the opposite of $j$'s color. The random walk interpretation
can also be extended for signed networks: if the $t$-step random walk from $i$ to $j$ passes an even number of negative
edges, then $i$'s color at step $t$ is the same as $j$'s color at step 0; while if it passes an odd number of negative edges, then
$i$'s color at step $t$ is the opposite of $j$'s color at step 0.
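
As an illustration of the update rule of the signed voter model, the following Python sketch performs one synchronous step. The function name `signed_voter_step`, the dictionary-based graph encoding, and the convention that a node without outgoing edges keeps its color are assumptions made for this example; they are not part of the model definition.

```python
import random


def signed_voter_step(adj, colors, rng):
    """One synchronous step of the signed voter model.

    adj[i] is a dict {j: weight} of i's outgoing edges (weight != 0);
    colors[i] is 1 for white and 0 for black; rng is a random.Random.
    """
    new_colors = {}
    for i, out_edges in adj.items():
        if not out_edges:
            # Assumption of this sketch: isolated nodes keep their color.
            new_colors[i] = colors[i]
            continue
        neighbors = list(out_edges)
        # Pick neighbor j with probability |A_ij| / sum_l |A_il|.
        weights = [abs(out_edges[j]) for j in neighbors]
        j = rng.choices(neighbors, weights=weights, k=1)[0]
        if out_edges[j] > 0:
            new_colors[i] = colors[j]        # friend: adopt j's color
        else:
            new_colors[i] = 1 - colors[j]    # foe: adopt the opposite color
    return new_colors


if __name__ == "__main__":
    # Node 0 trusts node 1; node 1 distrusts node 0.
    adj = {0: {1: 1.0}, 1: {0: -2.0}}
    print(signed_voter_step(adj, {0: 1, 1: 0}, random.Random(0)))
```

All nodes read the colors of the previous step before any node is updated, which matches the synchronous "at each step, every node" formulation.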

Given a signed digraph $G = (V, E, A)$, let $G^+ = (V, E^+, A^+)$ and $G^- = (V, E^-, A^-)$ denote the unsigned
subgraphs consisting of all positive edges $E^+$ and all negative edges $E^-$, respectively, where $A^+$ and $A^-$ are the
corresponding non-negative adjacency matrices. Thus we have $A = A^+ - A^-$. Similar to unsigned digraphs, $G$ is
aperiodic if the greatest common divisor of the lengths of all cycles in $G$ is 1, and $G$ is ergodic if it is strongly connected
and aperiodic. A sink component of a signed digraph is a strongly connected component that has no outgoing edges
to any nodes outside the component. When studying the long-term dynamics of the voter model, we assume that all
signed strongly connected components are ergodic. We first study the case of ergodic graphs, and then extend it to
the more general case of weakly connected or disconnected graphs with ergodic sink components. Table 1 provides
notations and terminologies used in the paper. Note that one basic fact we often use in studying long-term convergence
behavior is: if a matrix $P$ satisfies $\lim_{t \to \infty} P^t = 0$, then $I - P$ is invertible and $(I - P)^{-1} = \lim_{t \to \infty} \sum_{i=0}^{t} P^i$.

3 Analysis of voter model dynamics on signed digraphs 

In this section, we study the short-term and long-term dynamics of the voter model on signed digraphs. In particular, 
we answer the following two questions. 

(i) Short-term dynamics: Given an initial distribution of black and white nodes, what is the distribution of black and 
white nodes at step t > 0? 

(ii) Convergence of voter model: Given an initial distribution of black and white nodes, would the distribution 
converge, and what is the steady state distribution of black and white nodes? 

3.1 Short-term dynamics 

To study voter model dynamics on signed digraphs, we first define the signed transition matrix as follows. 

Definition 1 (Signed transition matrix). Given a signed digraph $G = (V, E, A)$, we define the signed transition matrix
of $G$ as $P = D^{-1} A$, where $D = \mathrm{diag}[d_i]$ is the diagonal matrix and $d_i = \sum_{j \in V} |A_{ij}|$ is the weighted out-degree of
node $i$.
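
Definition 1 can be made concrete with a few lines of Python. This is a minimal sketch, assuming the graph is given as a dense list-of-lists adjacency matrix; the function name `signed_transition_matrix` and the choice to raise an error for nodes without outgoing edges are assumptions of this example.

```python
def signed_transition_matrix(A):
    """Return P = D^{-1} A, where d_i = sum_j |A[i][j]|."""
    P = []
    for row in A:
        d = sum(abs(a) for a in row)   # weighted out-degree of this node
        if d == 0:
            # Assumption of this sketch: every node has an outgoing edge.
            raise ValueError("every node needs at least one outgoing edge")
        P.append([a / d for a in row])
    return P


if __name__ == "__main__":
    # Node 0 has a friend (node 1) and a foe (node 2) of equal weight.
    A = [[0, 2, -2],
         [1, 0, 0],
         [0, -3, 0]]
    for row in signed_transition_matrix(A):
        print(row)
```

Each row of $P$ has absolute values summing to 1, and the sign of each entry records whether the corresponding edge is a friend or a foe edge.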
The next proposition characterizes the dynamics of the voter model at each step using the signed transition matrix.

Proposition 1. Let $G = (V, E, A)$ be a signed digraph and denote $x_0$ as the initial white color distribution vector,
i.e., $x_0(i)$ represents the probability that node $i$ is white initially. Then, the white color distribution at step $t$, denoted
by $x_t$, can be computed as

$$x_t = P^t x_0 + \Big(\sum_{i=0}^{t-1} P^i\Big) g^-, \tag{1}$$

where $g^- = D^{-1} A^- \mathbf{1}$, i.e., $g^-(i)$ is the weighted fraction of outgoing negative edges of node $i$.

Proof. Based on the signed digraph voter model defined in Section 2, $x_t$ can be iteratively computed as

$$x_t(i) = \sum_{j \in V} \frac{A^+_{ij}}{d_i} x_{t-1}(j) + \sum_{j \in V} \frac{A^-_{ij}}{d_i} \big(1 - x_{t-1}(j)\big). \tag{2}$$

In matrix form, we have

$$x_t = D^{-1} A x_{t-1} + D^{-1} A^- \mathbf{1} = P x_{t-1} + g^-, \tag{3}$$

which yields Eq. (1) by repeatedly applying Eq. (3). □
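
The equivalence between the one-step recursion $x_t = P x_{t-1} + g^-$ and the closed form of Proposition 1 can be checked numerically. The sketch below is only an illustration: the helper names (`transition_and_bias`, `iterate`, `closed_form`) and the three-node example graph are assumptions of this example, and every node is assumed to have at least one outgoing edge.

```python
def transition_and_bias(A):
    """Return (P, g) with P = D^{-1} A and g = D^{-1} A^- 1."""
    P, g = [], []
    for row in A:
        d = sum(abs(a) for a in row)            # weighted out-degree (assumed > 0)
        P.append([a / d for a in row])
        g.append(sum(-a for a in row if a < 0) / d)
    return P, g


def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def iterate(P, g, x0, t):
    """Apply the recursion x_s = P x_{s-1} + g for s = 1..t."""
    x = list(x0)
    for _ in range(t):
        x = [px + gi for px, gi in zip(mat_vec(P, x), g)]
    return x


def closed_form(P, g, x0, t):
    """Evaluate P^t x_0 + (sum_{i=0}^{t-1} P^i) g."""
    power_x0 = list(x0)          # holds P^s x_0
    power_g = list(g)            # holds P^s g
    acc = [0.0] * len(g)         # holds sum_{i<s} P^i g
    for _ in range(t):
        acc = [a + b for a, b in zip(acc, power_g)]
        power_g = mat_vec(P, power_g)
        power_x0 = mat_vec(P, power_x0)
    return [a + b for a, b in zip(power_x0, acc)]


if __name__ == "__main__":
    A = [[0, 1, -1], [1, 0, 0], [0, -1, 0]]
    P, g = transition_and_bias(A)
    x0 = [1.0, 0.0, 0.0]         # only node 0 is white initially
    for t in range(5):
        print(t, iterate(P, g, x0, t), closed_form(P, g, x0, t))
```

Both functions cost $O(t\,|V|^2)$ with dense matrices; the closed form is mainly useful for analysis, since it separates the contribution of the initial state from that of the negative edges.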



3.2 Convergence of signed transition matrix with relation to structural balance of signed 
digraphs 

Eq. (1) implies that the long-term dynamics, i.e., the vector $x_t$ when $t$ goes to infinity, depends critically on the limits of
$P^t$ and $\sum_{i=0}^{t-1} P^i$. We show below that the limiting behavior of the two matrix sequences is fundamentally determined
by the structural balance of signed digraph $G$, which connects to the social balance theory well studied in the social
science literature (cf. [14]). We now define three types of signed digraphs based on their balance structures.

Definition 2 (Structural balance of signed digraphs). Let G = (V, E, A) be a signed digraph. 

1. Balanced digraph. $G$ is balanced if there exists a partition $S, \bar{S}$ of nodes in $V$, such that all edges within $S$ and
$\bar{S}$ are positive and all edges across $S$ and $\bar{S}$ are negative.

2. Anti-balanced digraph. $G$ is anti-balanced if there exists a partition $S, \bar{S}$ of nodes in $V$, such that all edges
within $S$ and $\bar{S}$ are negative and all edges across $S$ and $\bar{S}$ are positive.

3. Strictly unbalanced digraph. $G$ is strictly unbalanced if $G$ is neither balanced nor anti-balanced.
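
The three classes of Definition 2 can be tested with a standard two-coloring search on the graph with edge directions ignored. The sketch below is one possible way to do this and is not taken from the paper; the names `_two_side_partition` and `classify`, and the edge-list encoding `(i, j, sign)` with `sign` in `{+1, -1}`, are assumptions of this example.

```python
from collections import deque


def _two_side_partition(n, edges, same_side_sign):
    """Try to split nodes 0..n-1 into two sides so that edges whose sign equals
    same_side_sign stay within a side and all other edges cross sides.
    Returns the side assignment, or None if no such partition exists."""
    nbrs = [[] for _ in range(n)]
    for i, j, sign in edges:
        must_differ = sign != same_side_sign
        nbrs[i].append((j, must_differ))     # directions are ignored
        nbrs[j].append((i, must_differ))
    side = [None] * n
    for start in range(n):
        if side[start] is not None:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, must_differ in nbrs[u]:
                want = side[u] ^ int(must_differ)
                if side[v] is None:
                    side[v] = want
                    queue.append(v)
                elif side[v] != want:
                    return None              # contradictory constraint
    return side


def classify(n, edges):
    """Classify a signed digraph given as (i, j, sign) triples."""
    balanced = _two_side_partition(n, edges, +1) is not None
    anti_balanced = _two_side_partition(n, edges, -1) is not None
    if balanced and anti_balanced:
        return "balanced and anti-balanced"
    if balanced:
        return "balanced"
    if anti_balanced:
        return "anti-balanced"
    return "strictly unbalanced"


if __name__ == "__main__":
    print(classify(3, [(0, 1, -1), (1, 2, -1), (2, 0, -1)]))
    print(classify(2, [(0, 1, +1), (1, 0, -1)]))
```

The search runs in time linear in the number of nodes and edges. A graph can satisfy both conditions at once only if it is periodic, which is why the first return value exists.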

The balanced digraphs defined above correspond to the balanced graphs originally defined in social balance theory.
It is known that a balanced graph can be equivalently defined by the condition that all cycles in $G$, without considering
edge directions, contain an even number of negative edges [14]. On the other hand, the concept of anti-balanced
digraphs does not seem to appear in the social balance theory. Note that balanced digraphs and anti-balanced digraphs are
not mutually exclusive. For example, a four-node cycle with one pair of non-adjacent edges being positive and the
other pair being negative is both balanced and anti-balanced. However, for studying long-term dynamics, we only need
the above categorization for aperiodic digraphs, for which we show below that balanced digraphs and anti-balanced
digraphs are mutually exclusive.

Proposition 2. An aperiodic digraph G cannot be both balanced and anti-balanced. 

Proof. Suppose, for a contradiction, that an aperiodic digraph G is both balanced and anti-balanced. By the equivalent 
condition of balanced graphs, we know that all cycles of G have an even number of negative edges. Since an anti- 
balanced graph will become balanced if we negate the signs of all its edges, we know that all cycles of G also have 
an even number of positive edges. Therefore, all cycles of G must have an even number of edges, which means their
lengths have a common divisor 2, contradicting the assumption that G is aperiodic. □
With the above proposition, we know that balanced graphs, anti-balanced graphs, and strictly unbalanced graphs
indeed form a classification of aperiodic digraphs, where anti-balanced graphs and strictly unbalanced graphs together
correspond to unbalanced graphs in the social balance theory. We identify anti-balanced graphs as a special category
because they have a unique long-term dynamic behavior different from other graphs. An example of an anti-balanced graph
is a graph with only negative edges. In general, anti-balanced graphs could be viewed as an extreme in which much
hostility exists among individuals, e.g., networks formed by bidders in auctions [15, 28].

Case of ergodic signed digraphs. Now, we discuss the limiting behavior of $P^t$ for ergodic signed digraphs with the three
balance structures. A signed digraph $G = (V, E, A)$ is ergodic if and only if for any node $i$, there always exists a
signed path to any other node in $G$ and the greatest common divisor of the lengths of all cycles through $i$ is 1. Here, a signed path $R$ in
a signed graph $G$ is a sequence of nodes with the edges being directed from each node to the following one, where the
length of the path, denoted as $|R|$, is the total number of directed edges in $R$. The sign of a path is positive if there is
an even number of negative edges along the path; otherwise the sign of the path is negative. Below, we first introduce
Proposition 3, which shows that the balance structures of ergodic signed digraphs can be interpreted and distinguished in
terms of the path lengths and path signs in $G$. Building on it, Lemma 1 describes the limiting behaviors of $P^t$ for
ergodic signed digraphs with respect to the three balance structures.

Proposition 3. Let $G = (V, E, A)$ be an ergodic strictly unbalanced digraph. There exist two nodes $i$ and $j$, and two
directed paths from $i$ to $j$ with the same length but different signs.

Proof. Given the following three statements, we prove Statement 1 $\Rightarrow$ Statement 2 $\Rightarrow$ Statement 3,
which in turn proves this proposition, i.e., $\neg$Statement 3 $\Rightarrow$ $\neg$Statement 1. We assume that $G$ is a signed
ergodic digraph.

Statement 1: For any two nodes i and j, all paths from i to j with the same length have same signs. 
Statement 2: For any two nodes i and j, all paths from i to j with even length have same signs. 
Statement 3: G is either balanced or anti-balanced. 

(1) Proof by contradiction for Statement 1 $\Rightarrow$ Statement 2. We assume that in $G$, there exist two even-length
paths $R_{e1}$ and $R_{e2}$ from $i$ to $j$ with different signs. Since $G$ is ergodic, by Proposition 4 in Appendix A there must
exist a path, denoted by $R_o$, from $j$ to $i$ with odd length (no matter what sign it carries). Denote the lengths of these
three paths as $|R_{e1}|$, $|R_{e2}|$, and $|R_o|$, respectively.

Then, $R_{c1} = R_{e1} + R_o$ forms a cycle at node $i$ with odd length $|R_{e1}| + |R_o|$, and $R_{c2} = R_{e2} + R_o$ forms another
cycle at $i$ with odd length $|R_{e2}| + |R_o|$. Clearly, the two cycles $R_{c1}$ and $R_{c2}$ carry different signs. Then, let $R'_{c1} = R_{c1}^{|R_{c2}|}$
denote a cycle of node $i$ obtained by repeating $R_{c1}$ for $|R_{c2}|$ times, which has the same sign as $R_{c1}$ since $|R_{c2}|$ is odd.
Similarly, we construct a cycle $R'_{c2} = R_{c2}^{|R_{c1}|}$ by repeating $R_{c2}$ for $|R_{c1}|$ times, which has the same sign as $R_{c2}$. Thus
$R'_{c1}$ and $R'_{c2}$ have the same length $|R_{c1}| \cdot |R_{c2}|$ but different signs, which contradicts Statement 1.

(2) Proof for Statement 2 $\Rightarrow$ Statement 3. By the results in Appendix A, we know that between any two
nodes there must exist even-length paths. By Statement 2, we partition $V$ into $S$ and $\bar{S}$, based on the signs of the
even-length paths originating from a particular node $i \in V$. More specifically, $S$ contains the nodes to which all even-length
paths from $i$ have positive signs, and $\bar{S}$ contains the other nodes (note that $i$ may not be in $S$).

We argue that (a) within $S$ and $\bar{S}$, all edges have the same sign; and (b) all edges between $S$ and $\bar{S}$ have the same sign.
Since $G$ contains both negative and positive edges, it must be either balanced or anti-balanced.

For (a), assume to the contrary that there exist two directed edges $R_{ab} = a \to b$ and $R_{cd} = c \to d$ with different signs, both
residing in the same set, e.g., $S$. (The case for $\bar{S}$ is similar.)

We construct two even-length paths from $i$ to $c$ and from $i$ to $d$ as follows:

$$R_e(i, c) = R_e(i, b) + R_e(b, c),$$

$$R_e(i, d) = R_e(i, a) + R_{ab} + R_e(b, c) + R_{cd},$$

where $R_e(x, y)$ represents the constructed even-length path from node $x$ to node $y$.
Since both $c, d \in S$, by construction, $R_e(i, c)$ and $R_e(i, d)$ have the same sign:

$$\mathrm{sgn}(R_e(i, c)) = \mathrm{sgn}(R_e(i, d)). \tag{4}$$
On the other hand, since $a$ and $b$ are in the same group as $c$ and $d$, $\mathrm{sgn}(R_e(i, a)) = \mathrm{sgn}(R_e(i, b))$. Then, we have

$$\mathrm{sgn}(R_e(i, c)) = \mathrm{sgn}(R_e(i, b))\,\mathrm{sgn}(R_e(b, c)), \tag{5}$$

$$\mathrm{sgn}(R_e(i, d)) = \mathrm{sgn}(R_e(i, a))\,\mathrm{sgn}(R_{ab})\,\mathrm{sgn}(R_e(b, c))\,\mathrm{sgn}(R_{cd}) = -\mathrm{sgn}(R_e(i, b))\,\mathrm{sgn}(R_e(b, c)). \tag{6}$$

Eq. (6) comes from the assumption that $R_{ab}$ and $R_{cd}$ have different signs. Eq. (4) contradicts Eq. (5) and Eq. (6).

For (b), assume that there exist two edges $R_{ab}$ and $R_{cd}$ with different signs between $S$ and $\bar{S}$. Still consider the
two even-length paths $R_e(i, c)$ and $R_e(i, d)$ constructed before. Since $c$ and $d$ are not on the same side, $R_e(i, c)$ and
$R_e(i, d)$ have opposite signs by construction, i.e.,

$$\mathrm{sgn}(R_e(i, c)) = -\mathrm{sgn}(R_e(i, d)). \tag{7}$$

On the other hand, since $a$ and $b$ are in different groups as well, $\mathrm{sgn}(R_e(i, a)) = -\mathrm{sgn}(R_e(i, b))$. Then, we have

$$\mathrm{sgn}(R_e(i, c)) = \mathrm{sgn}(R_e(i, b))\,\mathrm{sgn}(R_e(b, c)), \tag{8}$$

$$\mathrm{sgn}(R_e(i, d)) = \mathrm{sgn}(R_e(i, a))\,\mathrm{sgn}(R_{ab})\,\mathrm{sgn}(R_e(b, c))\,\mathrm{sgn}(R_{cd}) = \mathrm{sgn}(R_e(i, b))\,\mathrm{sgn}(R_e(b, c)). \tag{9}$$

However, Eq. (7) contradicts Eq. (8) and Eq. (9). This completes the proof. □

The next lemma characterizes the limiting behavior of $P^t$ for ergodic signed digraphs with all three balance structures.
Given a signed digraph $G = (V, E, A)$, let $\bar{G} = (V, E, \bar{A})$ correspond to its unsigned version ($\bar{A}_{ij} = |A_{ij}|$ for
all $i, j \in V$). When $\bar{G}$ is ergodic, a random walk on $\bar{G}$ has a unique stationary distribution, denoted as $\pi$. That is,
$\pi^T = \pi^T \bar{P}$, where $\bar{P} = D^{-1} \bar{A}$ is the transition probability matrix for $\bar{G}$. Henceforth, we always use $S, \bar{S}$ to denote
the corresponding partition for either balanced graphs or anti-balanced graphs. We define the infinity norm of a matrix
$M \in \mathbb{R}^{m \times m}$ as $\|M\|_\infty := \max_{1 \le i \le m} \sum_{j=1}^{m} |M_{ij}|$.

Lemma 1. Given an ergodic signed digraph $G = (V, E, A)$, let $\bar{G} = (V, E, \bar{A})$ be the unsigned digraph. When $G$ is
balanced or strictly unbalanced, $P^t$ converges, and when $G$ is anti-balanced, the odd and even subsequences of $P^t$
converge to opposite matrices:

Balanced $G$: $\lim_{t \to \infty} P^t = \mathbf{1}_S \pi_S^T$;

Strictly unbalanced $G$: $\lim_{t \to \infty} P^t = 0$;

Anti-balanced $G$: $\lim_{t \to \infty} P^{2t} = \mathbf{1}_S \pi_S^T$, $\lim_{t \to \infty} P^{2t+1} = -\mathbf{1}_S \pi_S^T$.

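Proof. (1) When $G$ is balanced, the signed transition matrix $P$ can be written as $P = I_S \bar{P} I_S$. Since $\bar{G}$ is ergodic, we
have $\lim_{t \to \infty} \bar{P}^t = \mathbf{1} \pi^T$. Thus,

$$\lim_{t \to \infty} P^t = \lim_{t \to \infty} (I_S \bar{P} I_S)^t = I_S \mathbf{1} \pi^T I_S = \mathbf{1}_S \pi_S^T,$$

where we use the simple facts $I_S^2 = I$, $I_S \mathbf{1} = \mathbf{1}_S$, and $\pi^T I_S = \pi_S^T$.

(2) When $G$ is anti-balanced, we have $P = -I_S \bar{P} I_S$. Thus,

$$\lim_{t \to \infty} P^{2t} = \lim_{t \to \infty} (-I_S \bar{P} I_S)^{2t} = \mathbf{1}_S \pi_S^T,$$

$$\lim_{t \to \infty} P^{2t+1} = \lim_{t \to \infty} (-I_S \bar{P} I_S)^{2t+1} = -\mathbf{1}_S \pi_S^T.$$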

(3) By Proposition 3, given a signed strictly unbalanced digraph $G$, there exist a pair of nodes $i$ and $j$ such that two
paths $R_1$ and $R_2$ from $i$ to $j$ have the same length $\ell(i)$ and opposite signs. Consider a random walk from $i$. Let $p_1$
(resp. $p_2$) be the probability that the walk exactly follows $R_1$ (resp. $R_2$) in the first $\ell(i)$ steps. Let $\mathcal{R}_{i,k}^{\ell(i)}$ be the set of
all paths from $i$ to $k$ with length $\ell(i)$. Then, for a unit vector $e_i$ with $i$-th entry equal to 1 and other entries as 0, we
have

$$\|e_i^T P^{\ell(i)}\|_1 = \sum_{k \in V} \Big| \sum_{R \in \mathcal{R}_{i,k}^{\ell(i)}} \mathrm{Prob}[R]\, \mathrm{sgn}(R) \Big| \le 1 - \min(p_1, p_2) = \rho_i.$$

For any other node $i'$, there must exist a path $R'$ from $i'$ to $i$, due to the ergodicity of $G$; thus two paths $R'_1 =
R' + R_1$ and $R'_2 = R' + R_2$ from $i'$ to $j$ have the same length, but opposite signs. With similar arguments as that for
node $i$, $\|e_{i'}^T P^{\ell(i')}\|_1 \le \rho_{i'}$ holds for any $i' \in V$. Let $\rho = \max_i \rho_i < 1$ and $\ell = \max_i \ell(i)$; we conclude that for any $i \in V$,
$\|e_i^T P^{\ell}\|_1 \le \rho$ holds. Hence, when $t > T = 2\ell$, the following inequality holds:

$$\|P^t\|_\infty \le \|P^{\ell}\|_\infty^{\lfloor t/\ell \rfloor} \le \rho^{\lfloor t/\ell \rfloor} \le \rho^{t/T}.$$

Hence $\lim_{t \to \infty} \|P^t\|_\infty = 0$, i.e., $\lim_{t \to \infty} P^t = 0$. □



The above lemma clearly shows different convergence behaviors of $P^t$ for the three types of graphs. In particular, $P^t$
of anti-balanced graphs exhibits a bounded oscillating behavior in the long term.
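
The three behaviors in Lemma 1 can be observed numerically on tiny examples. The sketch below uses three hand-picked two-node signed digraphs with self-loops (so that they are ergodic); these example matrices and the helper names `signed_transition`, `mat_mul`, and `mat_pow` are assumptions made for illustration and do not come from the paper.

```python
def signed_transition(A):
    """P = D^{-1} A with d_i = sum_j |A[i][j]| (every d_i assumed positive)."""
    return [[a / sum(abs(b) for b in row) for a in row] for row in A]


def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def mat_pow(P, t):
    """Compute P^t by repeated multiplication (fine for small t and n)."""
    n = len(P)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = mat_mul(result, P)
    return result


# S = {0}, complement = {1}. Self-loops are edges within a side.
BALANCED = [[1, -1], [-1, 1]]             # positive within, negative across
ANTI_BALANCED = [[-1, 1], [1, -1]]        # negative within, positive across
STRICTLY_UNBALANCED = [[1, 1], [-1, 1]]   # neither condition can hold

if __name__ == "__main__":
    for name, A in [("balanced", BALANCED),
                    ("anti-balanced", ANTI_BALANCED),
                    ("strictly unbalanced", STRICTLY_UNBALANCED)]:
        P = signed_transition(A)
        print(name, "even power:", mat_pow(P, 50), "odd power:", mat_pow(P, 51))
```

In these examples the unsigned walk is uniform, so $\pi = (1/2, 1/2)^T$ and $\mathbf{1}_S \pi_S^T$ has entries $\pm 1/2$; the balanced powers settle at that matrix, the anti-balanced powers alternate between it and its negative, and the strictly unbalanced powers shrink toward zero.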

Case of weakly connected signed digraphs. Now, we consider a weakly connected signed digraph $G = (V, E, A)$
with one ergodic sink component $G_Z$ with node set $Z$, which only has incoming edges from the rest of the signed
digraph $G_X$ with node set $X = V \setminus Z$. Then, the signed transition matrix $P$ has the following block form:

$$P = \begin{bmatrix} P_X & P_Y \\ 0 & P_Z \end{bmatrix}, \tag{10}$$

where $P_X$ and $P_Z$ are the block matrices for components $G_X$ and $G_Z$, and $P_Y$ represents the one-way connections from
$G_X$ to $G_Z$. Then, the $t$-step transition matrix $P^t$ can be expressed as

$$P^t = \begin{bmatrix} P_X^{t} & P_Y^{(t)} \\ 0 & P_Z^{t} \end{bmatrix}, \tag{11}$$

where $P_Y^{(t)} = \sum_{i=0}^{t-1} P_X^{i} P_Y P_Z^{t-1-i}$. When $G_Z$ is balanced or anti-balanced, we use
$S_Z, \bar{S}_Z$ to denote the partition of $Z$ defining its balance or anti-balance structure. Then, we denote column vectors

$$u_b = (I_X - P_X)^{-1} P_Y \mathbf{1}_{Z,S_Z}, \tag{12}$$

$$u_u = (I_X + P_X)^{-1} P_Y \mathbf{1}_{Z,S_Z}. \tag{13}$$


The matrix $I_X - P_X$ is invertible because $\lim_{t \to \infty} P_X^t = 0$, which in turn holds because there is a path from
any node $i$ in $G_X$ to nodes in $Z$ (since $Z$ is the single sink), and thus informally a random walk from $i$ eventually
reaches and then stays in $G_Z$. The same reason applies to $I_X + P_X$. Lemma 2 provides the formal proof of the fact
$\lim_{t \to \infty} P_X^t = 0$.

Let $\pi_Z$ denote the stationary distribution of nodes in $G_Z$, and let $\pi_{Z,S_Z}$ be its signed version, with $\pi_{Z,S_Z}(i) = \pi_Z(i)$ for $i \in S_Z$,
and $\pi_{Z,S_Z}(i) = -\pi_Z(i)$ for $i \in Z \setminus S_Z$. Lemma 2 gives the convergence of $P^t$ under the various balance structures
of $G_Z$.

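Lemma 2. For a weakly connected signed digraph $G = (V, E, A)$ with one ergodic sink component, with signed
transition matrix given in Eq. (10), we have

Balanced $G_Z$: $\lim_{t \to \infty} P^t = \begin{bmatrix} 0 & u_b \pi_{Z,S_Z}^T \\ 0 & \mathbf{1}_{Z,S_Z} \pi_{Z,S_Z}^T \end{bmatrix}$;

Strictly unbalanced $G_Z$: $\lim_{t \to \infty} P^t = 0$;

Anti-balanced $G_Z$: $\lim_{t \to \infty} P^{2t} = \begin{bmatrix} 0 & -u_u \pi_{Z,S_Z}^T \\ 0 & \mathbf{1}_{Z,S_Z} \pi_{Z,S_Z}^T \end{bmatrix}$, $\lim_{t \to \infty} P^{2t+1} = \begin{bmatrix} 0 & u_u \pi_{Z,S_Z}^T \\ 0 & -\mathbf{1}_{Z,S_Z} \pi_{Z,S_Z}^T \end{bmatrix}$.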
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Proof. We discuss the convergence of P x , P|, and Py in Eq.dTTb. respectively. 
(1) We first prove that P x converges to 0, i.e., lim^oo P x = 0. 

Since Gx does not contain sink components, any node iel has a path to component Gz- Let Riz be the shortest 
path from i to some node in Z, and Prob[Riz] denote the probability that a random walk starting from i takes the path 
Riz- Hence we denote 

p = min ProblRiz], and m — max \Riz\, 

which implies that starting from any node i £ I, after m steps of random walk, there is at least probability p that it 
reaches component Gz- Hence, we have \\P' X Hoc < (1 — p) < 1- Let T = 2m, then for any t > T, we have 



Halloo = ||P| m ||oc < (1 -P) 1 ^ < (1 -P)*, 

which implies lim t ^oo ||Pxl|oo = 0, i.e., lim^oo P x = 0. 

(2) For subgraph Gz, Lemma[T]directly yields 

{0, Strictly unbalanced Gz', 

lz s^tt? ? i Balanced Gz; 
, ,bz f' Sz ' . . , , ' , (14) 

Iz.Sz 71 "^ s z ' Anti-balanced tz, even t; 

—1z,s z 'k'z s ' Anti-balanced Gz, odd t. 

(3) Below, we focus on proving the results on lim^oo P$9 using Proposition|6]in AppendixlBl 

When Gz is strictly unbalanced, from Lemma[T]and (1) in this proof, lim^oo P x = and lim^oo P| = hold, 
thus by Proposition|6]in Appendix IB"! lim^ ^ Py ' = 0. 

When Gz is balanced, Lemma Q] and Proposition in Appendix lAl directly yield (Pz — ^z,s z ' K 'z s z Y = Pz ~ 
lz,s z 7ris z for any integer t > 0, and lim t _^ oc (F z - lz.Sz 71 "!,.^ )' = °> thus 

t-i 

lim = lim Y^PxPy^z 1 ' 1 - lz,s z nl Sz + U,s z *ls z ) 

t— >oo £— >oo L — • ' ' 

4=0 

t-1 t-2 

= lim J2 P x p y(Pz ~ U^lsJ- 1 -* + lim E^^ 1 ^^ 

t— >oo A — » ' t— ► oo ^ — • 

1=0 i=0 



(Ix - Px) 1 Py1 Z ,s z ^z, Sz = u bK T z ,s. 



where the first term in the second line being is due to Proposition|6](ii) in AppendixlBl 

When Gz is anti-balanced, applying Lemma Q] and Proposition in Appendix [A] we have for any integer t > 0, 

(Pz + lz,s z nl,s z y = pt z- (- 1 ) tl 2,s z 7rI, Sz , and lim t ^ 00 (P z + ls.s,***,)* = hold true, thus 



lim PP = lim Y^PxPriPl 1 - 1 {-^^2,3 As E ~ U,s,*Z,s E )) 

t— >-00 t— >OQ L ' ' ' 

i=0 

t-1 t-2 

= }^L P x p y(Pz + Iz^lsJ- 1 - 1 + to^H)'" 1 "^ 1 ^^ 

i=0 i=0 
t-2 

= lim E(- p ^)'^ 1 ^s z ^5 z - (-^\lx +Px)- 1 Py1z,s z k T z, Sz 



i=0 



Z.Sz 



Hence, we have for anti-balanced Gz'- lim^oo Py^ = —u u tcz Sz , and lim^oo p|, 2t+1 ) = u u Tf^ s . □ 
Multiple sink components and disconnected signed digraphs. When there exist $m > 1$ ergodic sink components, i.e., $G_{Z_1}, G_{Z_2}, \dots, G_{Z_m}$, the rest of the graph $G$ is considered as $G_X$. Then the signed transition matrix $P$ and $P^t$ can be written as

\[
P = \begin{bmatrix}
P_X & P_{Y_1} & \cdots & P_{Y_m} \\
0 & P_{Z_1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & P_{Z_m}
\end{bmatrix},
\qquad
P^t = \begin{bmatrix}
P_X^t & P_{Y_1}^{(t)} & \cdots & P_{Y_m}^{(t)} \\
0 & P_{Z_1}^t & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & P_{Z_m}^t
\end{bmatrix}, \tag{15}
\]

where $P_{Y_i}^{(t)} = \sum_{j=0}^{t-1} P_X^j P_{Y_i} P_{Z_i}^{t-1-j}$, $1 \le i \le m$. Hence, each ergodic sink component $P_{Z_i}$, along with $P_X$ and $P_{Y_i}$, independently follows Lemma 2. For a disconnected signed digraph with $m > 1$ ergodic or weakly connected components, each component satisfies Lemma 1 or Lemma 2, respectively. For brevity, we omit the details here.



3.3 Long-term dynamics 

Based on the structural balance classification and the convergence of the signed transition matrix discussed above, we are now ready to analyze the long-term dynamics of the voter model on signed digraphs. Formally, we are interested in characterizing $x_t$ as $t \to \infty$, i.e.,

\[
x = \lim_{t\to\infty} x_t = \lim_{t\to\infty} \Big( P^t x_0 + \Big( \sum_{i=0}^{t-1} P^i \Big) g^- \Big). \tag{16}
\]

If the even and odd subsequences of $x_t$ converge separately, we denote $x_e = \lim_{t\to\infty} x_{2t}$ and $x_o = \lim_{t\to\infty} x_{2t+1}$. Before presenting the results on the long-term dynamics of the voter model, we first introduce the following useful lemma connecting a signed digraph $G$ with another graph $G'$ in which all edge signs of $G$ are negated.

Lemma 3. Given a signed digraph $G = (V, E, A)$, let $G' = (V, E, -A)$ be the signed digraph with all edge signs negated from $G$. Then, for any initial color distribution $x_0$, at any step $2t$ ($t > 0$), the color distributions $x_{2t}(G)$ on $G$ and $x_{2t}(G')$ on $G'$ are identical.

Proof. Let $P' = -P$ denote the signed transition matrix of $G'$, and denote the vectors $g^- = D^{-1}A^-\mathbf{1}$ and $g'^- = D^{-1}(-A)^-\mathbf{1} = D^{-1}A^+\mathbf{1}$. Thus, $g'^- = \mathbf{1} - g^-$. By the closed form $x_t = P^t x_0 + (\sum_{i=0}^{t-1} P^i) g^-$ (cf. Eq. (16)), after two steps, we have

\[
x_2(G') = P'^2 x_0 + P' g'^- + g'^- = P^2 x_0 - P(\mathbf{1} - g^-) + \mathbf{1} - g^- = P^2 x_0 + P g^- + g^- = x_2(G),
\]

where the last equality uses the facts $\mathbf{1} = D^{-1}(A^+ + A^-)\mathbf{1}$ and $P = D^{-1}A = D^{-1}(A^+ - A^-)$, which together give $P\mathbf{1} = \mathbf{1} - 2g^-$. Since the lemma holds for two steps, it clearly holds for all even steps. □
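Lemma 3 can be checked numerically on a toy graph. The sketch below is an illustration added here, not the authors' code: it assumes an edge list of `(i, j, w)` triples meaning that node `i` adopts (for `w > 0`) or opposes (for `w < 0`) the color of node `j`, and it iterates the update $x_t = P x_{t-1} + g^-$. The helper names `voter_step` and `trajectory` and the three-node graph are hypothetical.

```python
def voter_step(edges, n, x):
    """One step x_t = P x_{t-1} + g^- of the signed voter model.

    edges: list of (i, j, w); node i follows node j with signed weight w.
    Assumes every node has at least one outgoing edge.
    """
    out_weight = [0.0] * n
    for i, _, w in edges:
        out_weight[i] += abs(w)
    new_x = [0.0] * n
    for i, j, w in edges:
        p = abs(w) / out_weight[i]
        # Friend edge: copy j's color; foe edge: take the opposite color.
        new_x[i] += p * x[j] if w > 0 else p * (1.0 - x[j])
    return new_x


def trajectory(edges, n, x0, steps):
    """Return the list [x_0, x_1, ..., x_steps]."""
    xs = [list(x0)]
    for _ in range(steps):
        xs.append(voter_step(edges, n, xs[-1]))
    return xs


# Hypothetical 3-node signed digraph and its sign-negated copy G'.
edges = [(0, 1, 1), (0, 2, -1), (1, 2, 1), (2, 0, -1), (2, 1, 1)]
negated = [(i, j, -w) for i, j, w in edges]
x0 = [1.0, 0.0, 0.0]
on_g = trajectory(edges, 3, x0, 6)
on_g_neg = trajectory(negated, 3, x0, 6)
```

In this example the two trajectories agree at every even step, while at odd steps they are complementary (`x_t(G') = 1 - x_t(G)`), which follows from the same algebra as in the proof.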

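The next theorem discusses the case of ergodic signed digraphs.

Theorem 1. Let $G = (V, E, A)$ be an ergodic signed digraph. Then we have

\begin{align}
\text{Balanced } G: \quad & x = \mathbf{1}_S \pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) + \tfrac{1}{2}\mathbf{1} \tag{17} \\
\text{Strictly unbalanced } G: \quad & x = \tfrac{1}{2}\mathbf{1} \tag{18} \\
\text{Anti-balanced } G: \quad & x_e = \mathbf{1}_S \pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) + \tfrac{1}{2}\mathbf{1} \tag{19} \\
& x_o = -\mathbf{1}_S \pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) + \tfrac{1}{2}\mathbf{1} \tag{20}
\end{align}

Proof. We discuss the limit in Eq. (16) for the three possible balance structures of $G$.

Balanced digraphs. From Lemma 1 and Proposition 5 in Appendix A, it is easy to prove $P^m - \mathbf{1}_S\pi_S^T = (P - \mathbf{1}_S\pi_S^T)^m$ for any integer $m > 0$, which yields the following result on the second part in Eq. (16).

\begin{align}
\lim_{t\to\infty} \sum_{i=0}^{t-1} P^i g^- &= (I - P + \mathbf{1}_S\pi_S^T)^{-1} g^- + \lim_{t\to\infty} \sum_{i=1}^{t-1} \mathbf{1}_S\pi_S^T g^- \tag{21} \\
&= (I - P + \mathbf{1}_S\pi_S^T)^{-1} g^- = \tfrac{1}{2}\mathbf{1} - \tfrac{1}{2}\mathbf{1}_S\pi_S^T\mathbf{1}, \tag{22}
\end{align}

where the last term of Eq. (21) is canceled out due to the digraph flow circulation law [12, 26], i.e.,

\[
\pi_S^T g^- = \pi_S^T D^{-1} A^- \mathbf{1} = \sum_{i \in S} \sum_{j \in \bar{S}} \pi(i)\,|P_{ij}| - \sum_{i \in \bar{S}} \sum_{j \in S} \pi(i)\,|P_{ij}| = 0.
\]

The last equality in Eq. (22) holds because

\[
\tfrac{1}{2}(I - P + \mathbf{1}_S\pi_S^T)(\mathbf{1} - \mathbf{1}_S\pi_S^T\mathbf{1}) - g^- = 0.
\]

Eq. (17) is obtained by combining Eq. (22) with Lemma 1.

Anti-balanced digraphs. Lemma 3 directly yields Eq. (19). The odd-step influence distribution sequence is obtained by

\[
x_o = P x_e + g^- = -\mathbf{1}_S\pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) + \tfrac{1}{2}\mathbf{1}.
\]

Strictly unbalanced digraphs. From Lemma 1, $\lim_{t\to\infty} P^t = 0$ holds and thus we have

\[
\lim_{t\to\infty} \sum_{i=0}^{t-1} P^i g^- = (I - P)^{-1} g^- = (D - A)^{-1} A^- \mathbf{1} = \tfrac{1}{2}\mathbf{1}. \tag{23}
\]

The last equality comes from the fact $(D - A)\mathbf{1} = 2A^-\mathbf{1}$. □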

Theorem 1 has several implications. First of all, for strictly unbalanced digraphs, each node has equal steady-state probability of being black or white, and this probability is not determined by the initial distribution $x_0$. Secondly, an anti-balanced digraph has the same steady-state distribution as the corresponding balanced graph at even steps, while at odd steps the distribution oscillates to the opposite ($x_o = \mathbf{1} - x_e$). Moreover, Eq. (17) can also be intuitively explained from the random walk interpretation of the voter model. In particular, starting from node $i$, if we perform a random walk for an infinite number of steps, the probability that the random walk stops at $j$ is given by the stationary distribution $\pi(j)$. For balanced graphs, if $i$ and $j$ are from the same component (either $S$ or $\bar{S}$), then the random walk must pass an even number of negative edges, so $i$ takes the same color as $j$; if $i$ and $j$ are from opposite components, then the walk passes an odd number of negative edges and $i$ takes the opposite of $j$'s color. Thus, the steady-state probability of $i \in S$ being white is given by $\pi_S^T x_{0S} + \pi_{\bar{S}}^T (\mathbf{1}_{\bar{S}} - x_{0\bar{S}})$, where the subscripts in this expression denote restriction to the nodes of $S$ and $\bar{S}$, and the case of $i \in \bar{S}$ is symmetric. Some algebraic manipulation leads to Eq. (17).
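The balanced case of Theorem 1 can be checked numerically. The sketch below is an added illustration under stated assumptions, not the authors' implementation: it reuses the hypothetical `(i, j, w)` edge-list convention (node `i` follows node `j`), obtains the stationary distribution of the unsigned walk by plain power iteration, and compares the closed form of Eq. (17) with a long run of $x_t = P x_{t-1} + g^-$. The 3-node graph with $S = \{0, 1\}$ and $\bar{S} = \{2\}$ is made up for the example.

```python
def voter_step(edges, n, x):
    """One step x_t = P x_{t-1} + g^- of the signed voter model."""
    out_weight = [0.0] * n
    for i, _, w in edges:
        out_weight[i] += abs(w)
    new_x = [0.0] * n
    for i, j, w in edges:
        p = abs(w) / out_weight[i]
        new_x[i] += p * x[j] if w > 0 else p * (1.0 - x[j])
    return new_x


def stationary(edges, n, iters=500):
    """Power iteration for pi^T = pi^T |P| (the unsigned random walk)."""
    out_weight = [0.0] * n
    for i, _, w in edges:
        out_weight[i] += abs(w)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [0.0] * n
        for i, j, w in edges:
            nxt[j] += pi[i] * abs(w) / out_weight[i]
        pi = nxt
    return pi


def balanced_limit(pi, side, x0):
    """Eq. (17); side[v] is +1 for v in S and -1 for v in the complement."""
    shift = sum(side[v] * pi[v] * (x0[v] - 0.5) for v in range(len(pi)))
    return [side[v] * shift + 0.5 for v in range(len(pi))]


# Balanced ergodic toy graph: positive edges inside S = {0, 1},
# negative edges between S and node 2.
edges = [(0, 1, 1), (1, 0, 1), (1, 2, -1), (2, 0, -1)]
side = [1, 1, -1]
x0 = [1.0, 0.0, 0.0]

pi = stationary(edges, 3)
closed_form = balanced_limit(pi, side, x0)

x = list(x0)
for _ in range(500):
    x = voter_step(edges, 3, x)
```

For this graph the stationary distribution is (0.4, 0.4, 0.2), and both `closed_form` and the simulated `x` come out at approximately (0.6, 0.6, 0.4). Power iteration is used only for brevity; it can be slow on graphs with poor mixing.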

For a balanced ergodic digraph $G$ with partition $S, \bar{S}$, it is easy to check that it has the following two equilibrium states: in one state all nodes in $S$ are white while all nodes in $\bar{S}$ are black; and in the other state all nodes in $S$ are black while all nodes in $\bar{S}$ are white. We call these two states the polarized states. Using the random walk interpretation, we show in the following theorem that with probability 1, the voter model dynamic converges to one of the above two equilibrium states.

Theorem 2. Given an ergodic signed digraph $G = (V, E, A)$, if $G$ is balanced with partition $S, \bar{S}$, the voter model dynamic converges to one of the polarized states with probability 1, and the probability of nodes in $S$ being white is $\pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) + \tfrac{1}{2}$. Similarly, if $G$ is anti-balanced, with probability 1 the voter model dynamic eventually oscillates between the two polarized states, and the probability of nodes in $S$ being white at even steps is $\pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) + \tfrac{1}{2}$.

Proof. Consider a balanced ergodic digraph $G$ with partition $S, \bar{S}$. By ergodicity, given any two nodes $i$ and $j$, with probability 1 the random walks starting from $i$ and $j$ will meet eventually. If $i$ and $j$ are both in $S$, when the two walks meet at some node $u$, they both pass either an even number of negative edges (if $u \in S$) or an odd number of negative edges (if $u \in \bar{S}$). Therefore, $i$ and $j$ must have the same color with probability 1. If $i$ and $j$ are from different components $S$ and $\bar{S}$, a similar argument shows that they will have opposite colors with probability 1. Therefore the final state is one of the two polarized states. The probability of nodes in $S$ being white is simply given by Theorem 1, Eq. (17). The case of anti-balanced ergodic digraphs can be argued in a similar way. □

Theorem 3 below introduces the long-term dynamics of weakly connected signed digraphs. We consider a weakly connected $G$ with a single ergodic sink component $G_Z$, and use the same notation as in Section 3.2.

Theorem 3. Let $G = (V, E, A)$ be a weakly connected signed digraph with a single sink component $G_Z$ and the remaining subgraph $G_X$. The long-term white color distribution vector $x$ is expressed in two parts:

\[
x^T = \lim_{t\to\infty} x_t^T = [x_{XY}^T, \; x_Z^T],
\]

where $x_Z$ is the limit of $x_{tZ}$ on $G_Z$ with initial distribution $x_{0Z}$ and is given as in Theorem 1, and the vector $x_{XY}$ is given below with respect to the balance structure of $G_Z$.

\begin{align*}
\text{Balanced } G_Z: \quad & x_{XY} = \tfrac{1}{2}\mathbf{1}_X + u_b\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z) \\
\text{Strictly unbalanced } G_Z: \quad & x_{XY} = \tfrac{1}{2}\mathbf{1}_X \\
\text{Anti-balanced } G_Z, \text{ even } t: \quad & x_{XY,e} = \tfrac{1}{2}\mathbf{1}_X - u_u\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z) \\
\text{Anti-balanced } G_Z, \text{ odd } t: \quad & x_{XY,o} = \tfrac{1}{2}\mathbf{1}_X + u_u\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z)
\end{align*}

where $u_b$ and $u_u$ are defined in Eq. (12) and Eq. (13).

Proof. Let the initial distribution be $x_0^T = [x_{0X}^T, x_{0Z}^T]$ and let $(g^-)^T = [g_X^T, g_Z^T]$. When $t \to \infty$, Eq. (16) can be written as

\[
x^T = \lim_{t\to\infty} x_t^T = [x_{XY}^T, \; x_Z^T] = [x_X^T + x_Y^T, \; x_Z^T],
\]

where $x_X = \lim_{t\to\infty} (P_X^t x_{0X} + \sum_{i=0}^{t-1} P_X^i g_X)$, $x_Y = \lim_{t\to\infty} (P_Y^{(t)} x_{0Z} + \sum_{i=0}^{t-1} P_Y^{(i)} g_Z)$, and $x_Z = \lim_{t\to\infty} (P_Z^t x_{0Z} + \sum_{i=0}^{t-1} P_Z^i g_Z)$.

From Lemma 2, $\lim_{t\to\infty} P_X^t = 0$, thus $x_X = (I_X - P_X)^{-1} g_X$ holds for any ergodic $G_Z$. Since $G_Z$ is ergodic, $x_Z$ follows Theorem 1. Below we focus on deriving $x_Y$, where the first part of $x_Y$ satisfies Lemma 2, i.e.,

\[
\lim_{t\to\infty} P_Y^{(t)} x_{0Z} =
\begin{cases}
0, & G_Z \text{ is strictly unbalanced;} \\
u_b\, \pi_{Z,S_Z}^T x_{0Z}, & G_Z \text{ is balanced;} \\
-u_u\, \pi_{Z,S_Z}^T x_{0Z}, & G_Z \text{ is anti-balanced, even } t; \\
u_u\, \pi_{Z,S_Z}^T x_{0Z}, & G_Z \text{ is anti-balanced, odd } t.
\end{cases}
\]

The second part of $x_Y$ can be further written as

\begin{align}
\lim_{m\to\infty} \sum_{t=1}^{m} P_Y^{(t)} g_Z &= \lim_{m\to\infty} \sum_{t=1}^{m} \sum_{i=0}^{t-1} P_X^i P_Y P_Z^{t-1-i} g_Z \notag \\
&= \lim_{m\to\infty} \sum_{t=0}^{m-1} \sum_{i=0}^{m-1-t} (P_X^t P_Y P_Z^i) g_Z = \sum_{t=0}^{\infty} \Big( P_X^t P_Y \sum_{i=0}^{\infty} P_Z^i \Big) g_Z. \tag{24}
\end{align}

Now we discuss Eq. (24) under different balance structures of $G_Z$.

(1) $G_Z$ is strictly unbalanced. From Lemma 2, $\lim_{t\to\infty} P^t = 0$. Then by Eq. (23) we directly obtain that $x_{XY} = \tfrac{1}{2}\mathbf{1}_X$. Explicitly, applying Eq. (23) to $\sum_{i=0}^{\infty} P_Z^i g_Z$ in Eq. (24), we have

\[
\lim_{m\to\infty} \sum_{t=1}^{m} P_Y^{(t)} g_Z = \tfrac{1}{2}(I_X - P_X)^{-1} P_Y \mathbf{1}_Z.
\]

Thus, we obtain the following equation:

\[
x_{XY} = x_X + x_Y = (I_X - P_X)^{-1} (g_X + \tfrac{1}{2} P_Y \mathbf{1}_Z) = \tfrac{1}{2}\mathbf{1}_X.
\]

(2) $G_Z$ is balanced. Using Eq. (22), we have

\[
\lim_{m\to\infty} \sum_{t=1}^{m} P_Y^{(t)} g_Z = \tfrac{1}{2}(I_X - P_X)^{-1} P_Y (\mathbf{1}_Z - \mathbf{1}_{Z,S_Z}\pi_{Z,S_Z}^T \mathbf{1}_Z).
\]

Hence, we have

\[
x_{XY} = (I_X - P_X)^{-1} (g_X + \tfrac{1}{2} P_Y \mathbf{1}_Z) + u_b\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z) = \tfrac{1}{2}\mathbf{1}_X + u_b\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z). \tag{25}
\]

(3) $G_Z$ is anti-balanced. Using Lemma 3, we can negate the signs of all edges in $G$ so that the sink becomes balanced. Hence, we know that at even steps in the long term,

\[
x_{XY,e} = \tfrac{1}{2}\mathbf{1}_X - u_u\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z), \tag{26}
\]

where Eq. (26) and Eq. (25) are identical in the sense that the $P_X$'s and $P_Y$'s in Eq. (26) and Eq. (25) have opposite signs. Moreover, the odd-step influence distribution sequence is obtained by

\[
x_{XY,o} = P_X x_{XY,e} + P_Y x_{Z,e} + g_X = \tfrac{1}{2}\mathbf{1}_X + u_u\, \pi_{Z,S_Z}^T (x_{0Z} - \tfrac{1}{2}\mathbf{1}_Z). \tag{27}
\]

□

Theorem 3 characterizes the long-term dynamics when the underlying graph is a weakly connected signed digraph with one ergodic sink component. We can see that the results for balanced and anti-balanced sink components are more complicated than in the ergodic digraph case, since how non-sink components are connected to the sink subtly affects the final outcome of the steady-state behavior. In steady state, while the sink component is still in one of the two polarized states as stated in Theorem 2, the non-sink components exhibit a more complicated color distribution, for which we provide probability characterizations in Theorem 3. Using Eq. (15), Theorem 1 and Theorem 3 can be readily extended to the case with more than one ergodic sink component and to disconnected digraphs.



4 Influence maximization 

With the detailed analysis on voter model dynamics for signed digraphs, we are ready now to solve the influence 
maximization problem. Intuitively, we want to address the following question: If only at most k nodes could be 
selected initially and be turned white while all other nodes are black, how should we choose seed nodes so as to 
maximize the expected number of white nodes in short term and in long term, respectively? 

4.1 Influence maximization problem 

Influence maximization objectives. We consider two types of short-term influence objectives, one is the instant 
influence, which counts the total number of influenced nodes at a step t > 0; the other is the average influence, which 
takes the average number of influenced nodes within the first t steps. These two objectives have different implications 
and applications. For example, political campaigns try to convince voters who may change their minds back and forth, 
but only the voters' opinions on the voting day are counted, which matches the instant influence. On the other hand, 
a credit card company would like to have customers keep using their credit card service as much as possible, which 
is better interpreted by the average influence. When t is sufficiently large, it becomes the long-term objective, and 
long-term average influence coincides with long-term instant influence when the dynamic converges. 

Formally, we define the short-term instant influence $f_t(x_0)$ and the short-term average influence $\bar{f}_t(x_0)$ as follows:

\[
f_t(x_0) := \mathbf{1}^T x_t(x_0) \quad \text{and} \quad \bar{f}_t(x_0) := \frac{\sum_{i=0}^{t} f_i(x_0)}{t+1}. \tag{28}
\]

Moreover, we define the long-term influence as

\[
f(x_0) := \lim_{t\to\infty} \frac{\sum_{i=0}^{t} f_i(x_0)}{t+1}. \tag{29}
\]

Note that when the dynamic converges (e.g., ergodic balanced or ergodic strictly unbalanced graphs), $f(x_0) = \lim_{t\to\infty} f_t(x_0)$. For ergodic anti-balanced graphs (or sink components), it is essentially the average of the even- and odd-step limit influences.

Given a set $W \subseteq V$, let $e_W$ be the vector in which $e_W(j) = 1$ if $j \in W$ and $e_W(j) = 0$ if $j \notin W$, which represents the initial seed distribution with only nodes in $W$ as white seeds. Let $e_i$ be the shorthand of $e_{\{i\}}$. Unlike in unsigned graphs, if initially no white seeds are selected on a signed digraph $G$, i.e., $x_0 = 0$, the instant influence $f_t(0)$ at step $t$ is in general non-zero, which is referred to as the ground influence of the graph $G$ at $t$. The influence contribution of a seed set $W$ does not count such ground influence, as shown in Definition 3.

Definition 3 (Influence contribution). The instant influence contribution of a seed set $W$ to the $t$-th step instant influence objective, denoted by $c_t(W)$, is the difference between the instant influence at step $t$ with only nodes in $W$ selected as seeds and the ground influence at step $t$: $c_t(W) = f_t(e_W) - f_t(0)$. The average influence contribution $\bar{c}_t(W)$ and long-term influence contribution $c(W)$ are defined in the same way: $\bar{c}_t(W) = \bar{f}_t(e_W) - \bar{f}_t(0)$ and $c(W) = f(e_W) - f(0)$.

We are now ready to formally define the influence maximization problem. 

Definition 4 (Influence maximization). The influence maximization problem for short-term instant influence is finding a seed set $W$ of at most $k$ seeds that maximizes $W$'s instant influence contribution at step $t$, i.e., finding $W^* = \arg\max_{|W| \le k} c_t(W)$. Similarly, the problems for average influence and long-term influence are finding $W^* = \arg\max_{|W| \le k} \bar{c}_t(W)$ and $W^* = \arg\max_{|W| \le k} c(W)$, respectively.

We now provide some properties of influence contribution, which lead to the optimal seed selection rule. By the closed form $x_t = P^t x_0 + (\sum_{i=0}^{t-1} P^i) g^-$ (cf. Eq. (16)), we have

\[
c_t(W) = f_t(e_W) - f_t(0) = \mathbf{1}^T x_t(e_W) - \mathbf{1}^T x_t(0) = \mathbf{1}^T P^t e_W. \tag{30}
\]

Let $c_t(i)$ be the shorthand of $c_t(\{i\})$, and let $c_t = [c_t(i)]$ denote the vector of influence contributions of individual nodes. Then $c_t^T = [c_t(i)]^T = \mathbf{1}^T P^t$. When $t \to \infty$, the long-term influence contributions of individual nodes are obtained as a vector $c$:

\[
c^T = \lim_{t\to\infty} \frac{\sum_{i=0}^{t} c_i^T}{t+1} = \lim_{t\to\infty} \frac{\mathbf{1}^T \sum_{i=0}^{t} P^i}{t+1}. \tag{31}
\]

When $P^t$ converges, we simply have

\[
c^T = \mathbf{1}^T \lim_{t\to\infty} P^t. \tag{32}
\]

Lemma 4 below discloses the important property that the influence contribution is a linear set function.

Lemma 4. Given a white seed set $W$, $c_t(W) = \sum_{i \in W} c_t(i)$, $\bar{c}_t(W) = \sum_{i \in W} \bar{c}_t(i)$, and $c(W) = \sum_{i \in W} c(i)$.

Proof. From Eq. (30), we have

\[
c_t(W) = \mathbf{1}^T P^t e_W = \mathbf{1}^T P^t \sum_{i \in W} e_i = \sum_{i \in W} \mathbf{1}^T P^t e_i = \sum_{i \in W} c_t(i).
\]

The linearity of $\bar{c}_t$ and $c$ can be derived from that of $c_t$. □
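The ground influence and the linearity stated in Lemma 4 can be seen on a three-node example. This is a minimal sketch added for illustration, assuming the hypothetical `(i, j, w)` edge-list convention in which node `i` follows node `j`; the function names are not from the paper. It computes $c_t(W) = f_t(e_W) - f_t(0)$ by direct simulation rather than through $\mathbf{1}^T P^t$.

```python
def voter_step(edges, n, x):
    """One step x_t = P x_{t-1} + g^- of the signed voter model."""
    out_weight = [0.0] * n
    for i, _, w in edges:
        out_weight[i] += abs(w)
    new_x = [0.0] * n
    for i, j, w in edges:
        p = abs(w) / out_weight[i]
        new_x[i] += p * x[j] if w > 0 else p * (1.0 - x[j])
    return new_x


def instant_influence(edges, n, x0, t):
    """f_t(x0): expected number of white nodes at step t."""
    x = list(x0)
    for _ in range(t):
        x = voter_step(edges, n, x)
    return sum(x)


def contribution(edges, n, seeds, t):
    """c_t(W) = f_t(e_W) - f_t(0), i.e. influence beyond the ground influence."""
    e_w = [1.0 if v in seeds else 0.0 for v in range(n)]
    ground = instant_influence(edges, n, [0.0] * n, t)
    return instant_influence(edges, n, e_w, t) - ground


# Nodes 0 and 1 are mutual friends; node 2 treats node 0 as a foe.
edges = [(0, 1, 1), (1, 0, 1), (2, 0, -1)]
n, t = 3, 1
ground = instant_influence(edges, n, [0.0] * n, t)
single = [contribution(edges, n, {v}, t) for v in range(n)]
joint = contribution(edges, n, {0, 1}, t)
```

Here the ground influence is 1 (node 2 turns white by opposing a black node 0), seeding node 0 contributes 0 because its gain at node 1 is offset by the loss at node 2, and the contribution of `{0, 1}` equals the sum of the two single-node contributions.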

Given a vector $v$, let $n^+(v)$ denote the number of positive entries in $v$. By applying Lemma 4, we have the optimal seed selection rule for instant influence maximization as follows.

Optimal seed selection rule for instant influence maximization. Given a signed digraph and a limited budget $k$, selecting the top $\min\{k, n^+(c_t)\}$ seeds with the highest $c_t(i)$'s, $i \in V$, leads to the maximized instant influence at step $t > 0$.

Note that the influence contributions of some nodes may be negative and these nodes should not be selected as white seeds, and thus the optimal solution may have fewer than $k$ seeds. The rules for average influence maximization and long-term influence maximization follow the same pattern. Therefore, the central task now becomes the computation of the influence contributions of individual nodes. Below, we introduce our SVIM algorithm, for Signed Voter model Influence Maximization.
4.2 Short-term influence maximization

By applying Definition 3 and Lemma 4, we develop the SVIM-S algorithm to solve the short-term instant and average influence maximization problems, as shown in Algorithm 1.

Algorithm 1 Short-term influence maximization SVIM-S
1: INPUT: Signed transition matrix $P$, short-term period $t$, budget $k$;
2: OUTPUT: White seed set $W$.
3: $c_t = \mathbf{1}$; $\bar{c}_t = \mathbf{1}$;
4: for $i = 1 : t$ do
5:     $c_t^T = c_t^T P$; (for instant influence maximization.)
6:     $\bar{c}_t = \bar{c}_t + c_t$; (for average influence maximization.)
7: $W$ = top $\min\{k, n^+(c_t)\}$ (resp. $\min\{k, n^+(\bar{c}_t)\}$) nodes with the highest $c_t(i)$ (resp. $\bar{c}_t(i)$) values, for instant (resp. average) influence maximization.

The SVIM-S algorithm requires $t$ vector-matrix multiplications, each of which takes $|E|$ entry-wise multiplication operations. Hence the total time complexity of SVIM-S is $O(t \cdot |E|)$.
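Algorithm 1 maps directly onto a sparse edge-list loop, which makes the $O(t \cdot |E|)$ cost visible. The following is a sketch under stated assumptions rather than the authors' (Matlab) implementation: the graph is a hypothetical list of `(i, j, w)` triples with $P_{ij} = w / \sum_j |w_{ij}|$, nodes are numbered `0..n-1`, and the function name `svim_s` is ours.

```python
def svim_s(edges, n, t, k, average=False):
    """Sketch of SVIM-S: rank nodes by c_t = (1^T P^t)^T or by its running sum.

    edges: list of (i, j, w) with signed weight w; row i of P is normalized
    by the total absolute out-weight of node i.
    Returns (seeds, scores).
    """
    out_weight = [0.0] * n
    for i, _, w in edges:
        out_weight[i] += abs(w)

    c = [1.0] * n        # c_0^T = 1^T
    c_sum = [1.0] * n    # running sum used for the average objective
    for _ in range(t):
        nxt = [0.0] * n
        for i, j, w in edges:
            # (c^T P)_j = sum_i c_i * P_ij, one multiply-add per edge
            nxt[j] += c[i] * (w / out_weight[i])
        c = nxt
        c_sum = [a + b for a, b in zip(c_sum, c)]

    scores = c_sum if average else c
    ranked = sorted(range(n), key=lambda v: scores[v], reverse=True)
    # Keep at most k nodes, and only those with positive contribution.
    seeds = [v for v in ranked if scores[v] > 0][:k]
    return seeds, scores


# Nodes 0 and 1 are mutual friends; node 2 treats node 0 as a foe.
edges = [(0, 1, 1), (1, 0, 1), (2, 0, -1)]
seeds_1, scores_1 = svim_s(edges, n=3, t=1, k=2)
seeds_2, scores_2 = svim_s(edges, n=3, t=2, k=2)
```

With budget `k = 2` the sketch still returns a single seed in both cases, because only one node has a positive contribution: node 1 at `t = 1` and node 0 at `t = 2`. The running sum is not divided by `t + 1`, since the normalization does not change the ranking.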

4.3 Long-term influence maximization 

We now study the long-term influence contribution $c$ and introduce the corresponding influence maximization algorithm SVIM-L. We will see that the computation of the influence contribution $c$ and the seed selection schemes depend on the structural balance and connectedness of the graph. While seed selection for balanced ergodic digraphs still has intuitive explanations, the computation for weakly connected and disconnected digraphs is more involved and less intuitive.

4.3.1 Case of ergodic signed digraphs 

When the signed digraph $G = (V, E, A)$ is ergodic, Lemma 5 below characterizes the long-term influence contributions of nodes, with respect to various balance structures.

Lemma 5. Consider an ergodic signed digraph $G = (V, E, A)$. If $G$ is balanced, with bipartition $S$ and $\bar{S}$, the influence contribution vector is $c = (|S| - |\bar{S}|)\pi_S$. If $G$ is anti-balanced or strictly unbalanced, $c = 0$.

Proof. (1) When $G$ is balanced, by Lemma 1 and Eq. (32),

\[
c^T = \mathbf{1}^T \lim_{t\to\infty} P^t = \mathbf{1}^T \mathbf{1}_S \pi_S^T = (|S| - |\bar{S}|)\pi_S^T.
\]

(2) When $G$ is strictly unbalanced, again by Lemma 1 and Eq. (32), we have $c^T = \mathbf{1}^T \lim_{t\to\infty} P^t = 0$.

(3) When $G$ is anti-balanced, by Lemma 1 and Eq. (31), we have

\[
c^T = \mathbf{1}^T \frac{\lim_{t\to\infty} P^{2t} + \lim_{t\to\infty} P^{2t+1}}{2} = 0.
\]

□

Based on Lemma 5, Algorithm 2 summarizes how to compute the long-term influence contribution $c$ on ergodic signed digraphs.

Lemma 5 suggests that for ergodic balanced digraphs, we should pick the larger component, e.g., $S$ if $|S| > |\bar{S}|$, and select the top $\min\{k, |S|\}$ nodes from $S$ with the largest stationary probabilities as white seeds. Selecting these nodes maximizes the probability of the larger component being white.

Theorem 1 indicates that given an anti-balanced digraph $G$, with bipartition $S$ and $\bar{S}$, the long-term dynamic $x_t$ oscillates between odd and even steps, and the long-term influence contribution is 0. However, we can still maximize the strength of the oscillation of the voter model on an anti-balanced ergodic digraph by properly choosing the initial white seeds (see Remark 1).
Algorithm 2 c = ergodic(G)
1: INPUT: Signed transition matrix $P$.
2: OUTPUT: Long-term influence contribution vector $c$.
3: Detect the structure of the ergodic signed digraph $G$;
4: if $G$ is balanced, with bipartition $S$ and $\bar{S}$ then
5:     Compute the stationary distribution $\pi$ of $P$;
6:     $c = (|S| - |\bar{S}|)\pi_S$;
7: else
8:     $c = 0$;
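The structure-detection step of Algorithm 2 is not spelled out in the listing. One common way to do it, shown here as a sketch and not necessarily the authors' procedure, is a breadth-first 2-coloring of the underlying undirected graph: positive edges force equal sides, negative edges force opposite sides, and any conflict means the graph is not balanced. Anti-balance is tested by running the same check on the sign-negated graph. The function names, the `(i, j, w)` edge format, and the assumption that the graph is weakly connected with nodes `0..n-1` are ours; the stationary distribution `pi` is taken as an input.

```python
from collections import deque


def balance_partition(n, edges):
    """Return side[v] in {+1, -1} if the signed graph is balanced, else None."""
    adj = [[] for _ in range(n)]
    for i, j, w in edges:
        sign = 1 if w > 0 else -1
        adj[i].append((j, sign))
        adj[j].append((i, sign))
    side = [0] * n
    side[0] = 1
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, sign in adj[u]:
            if side[v] == 0:
                side[v] = side[u] * sign
                queue.append(v)
            elif side[v] != side[u] * sign:
                return None  # a cycle with an odd number of negative edges
    return side


def classify(n, edges):
    """Classify an ergodic signed digraph by its balance structure."""
    if balance_partition(n, edges) is not None:
        return "balanced"
    if balance_partition(n, [(i, j, -w) for i, j, w in edges]) is not None:
        return "anti-balanced"
    return "strictly unbalanced"


def ergodic_contribution(n, edges, pi):
    """Lemma 5: c = (|S| - |S-bar|) * signed pi if balanced, else 0."""
    side = balance_partition(n, edges)
    if side is None:
        return [0.0] * n
    size_gap = sum(side)  # equals |S| - |S-bar|
    return [size_gap * side[v] * pi[v] for v in range(n)]


balanced = [(0, 1, 1), (1, 0, 1), (1, 2, -1), (2, 0, -1)]
anti = [(i, j, -w) for i, j, w in balanced]
mixed = [(0, 1, 1), (1, 0, -1), (1, 2, 1), (2, 0, 1)]
c = ergodic_contribution(3, balanced, [0.4, 0.4, 0.2])
```

For the balanced example, with $S = \{0, 1\}$ and stationary distribution (0.4, 0.4, 0.2), the contribution vector is (0.4, 0.4, -0.2), so only nodes in the larger side are worth seeding.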



Remark 1. Consider an anti-balanced ergodic digraph $G = (V, E, A)$ with bipartition $S$ and $\bar{S}$ and a budget $k$. Let $W'$ (resp. $W''$) denote the initial seed set in which the $\min\{k, |S|\}$ (resp. $\min\{k, |\bar{S}|\}$) nodes with the highest stationary probabilities $\pi(i)$ in $S$ (resp. $\bar{S}$) are selected. Then, the optimal $W^*$ that maximizes the strength of oscillation is

\[
W^* := \arg\max_{W \in \{W', W''\}} \big| \pi_S^T (e_W - \tfrac{1}{2}\mathbf{1}) \big|. \tag{33}
\]

Proof. From Theorem 1, when $t$ becomes sufficiently large, the vector $x_t$ oscillates between two vectors on odd and even steps, respectively. The strength of the oscillation is

\[
\frac{|f_o(x_0) - f_e(x_0)|}{2} = \frac{|\mathbf{1}^T (x_o(x_0) - x_e(x_0))|}{2} = \big| \mathbf{1}^T \mathbf{1}_S \pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) \big| = \big| |S| - |\bar{S}| \big| \cdot \big| \pi_S^T (x_0 - \tfrac{1}{2}\mathbf{1}) \big|.
\]

Let $W$ be the initial seed set; then the oscillation strength maximization is formulated as

\[
\max_{|W| \le k} \big| |S| - |\bar{S}| \big| \cdot \big| \pi_S^T (e_W - \tfrac{1}{2}\mathbf{1}) \big| = \big| |S| - |\bar{S}| \big| \cdot \max\Big\{ \max_{|W| \le k} \{\pi_S^T e_W\} - \tfrac{1}{2}\pi_S^T \mathbf{1}, \; \max_{|W| \le k} \{-\pi_S^T e_W\} + \tfrac{1}{2}\pi_S^T \mathbf{1} \Big\}, \tag{34}
\]

which contains two sub-problems, i.e., $\max_{|W| \le k} \{\pi_S^T e_W\}$ and $\max_{|W| \le k} \{-\pi_S^T e_W\}$. The first maximization problem can be rewritten as

\[
\max_{|W| \le k} \{\pi_S^T e_W\} = \max_{|W| \le k} \Big( \sum_{i \in S} \pi(i) e_W(i) - \sum_{i \in \bar{S}} \pi(i) e_W(i) \Big). \tag{35}
\]

Thus, let $W'$ denote the optimal solution to the problem in Eq. (35), which is obtained by choosing the $\min\{k, |S|\}$ seeds with the highest $\pi(i)$'s from $S$. Similarly, choosing the $\min\{k, |\bar{S}|\}$ nodes with the highest $\pi(i)$'s from $\bar{S}$ yields the optimal solution, denoted by $W''$, to the second maximization problem $\max_{|W| \le k} \{-\pi_S^T e_W\}$. The optimal $W$ for the problem in Eq. (34) that maximizes the oscillation strength is the one in $\{W', W''\}$ with the higher $|\pi_S^T (e_W - \tfrac{1}{2}\mathbf{1})|$, which completes the proof of Eq. (33).

□
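The selection rule of Remark 1 amounts to building two candidate seed sets and comparing one scalar for each. The sketch below is an added illustration with hypothetical names: `pi` is the stationary distribution, `side[v]` is `+1` for nodes in $S$ and `-1` for nodes in $\bar{S}$, and the constant factor $\big||S| - |\bar{S}|\big|$ is dropped because it does not affect the comparison.

```python
def max_oscillation_seeds(pi, side, k):
    """Pick W' (top-k of S) or W'' (top-k of S-bar), whichever oscillates more.

    pi:   stationary distribution of the unsigned walk.
    side: side[v] = +1 if v is in S, -1 if v is in the complement.
    """
    n = len(pi)
    signed_pi = [side[v] * pi[v] for v in range(n)]
    half_total = 0.5 * sum(signed_pi)  # (1/2) * pi_S^T 1

    def top_of(group_sign):
        members = [v for v in range(n) if side[v] == group_sign]
        members.sort(key=lambda v: pi[v], reverse=True)
        return members[:k]

    def strength(seeds):
        # | pi_S^T (e_W - 1/2 * 1) |
        return abs(sum(signed_pi[v] for v in seeds) - half_total)

    w_first, w_second = top_of(1), top_of(-1)
    return w_first if strength(w_first) >= strength(w_second) else w_second


# Made-up stationary distribution with S = {0, 1} and S-bar = {2}.
pi = [0.5, 0.3, 0.2]
side = [1, 1, -1]
best = max_oscillation_seeds(pi, side, k=1)
```

In this example seeding node 2 gives strength |-0.2 - 0.3| = 0.5, while seeding node 0 gives only |0.5 - 0.3| = 0.2, so the smaller side wins. This shows that the best seed set for oscillation need not lie in the side with the larger stationary mass.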



4.3.2 Case of weakly connected signed digraphs 

We first consider a weakly connected signed digraph $G$ which has a single ergodic sink component $G_Z$ with only incoming edges from the remaining nodes $X = V \setminus Z$.

Lemma 6. Consider a weakly connected digraph $G = (V, E, A)$ with a single ergodic sink component $G_Z$. If $G_Z$ is balanced, with partition $S_Z$ and $\bar{S}_Z$, the long-term influence contribution vector is $c^T = [c_X^T, c_Z^T]$, where $c_X = 0_X$ and $c_Z = (\mathbf{1}_X^T u_b + |S_Z| - |\bar{S}_Z|)\pi_{Z,S_Z}$. If $G_Z$ is anti-balanced or strictly unbalanced, $c = 0$.

Proof. (1) When $G_Z$ is balanced, by Lemma 2, $c_X = 0_X$, and

\[
c_Z^T = (\mathbf{1}_X^T u_b + \mathbf{1}_Z^T \mathbf{1}_{Z,S_Z})\pi_{Z,S_Z}^T = (\mathbf{1}_X^T u_b + |S_Z| - |\bar{S}_Z|)\pi_{Z,S_Z}^T.
\]

(2) When $G_Z$ is strictly unbalanced, $c^T = \mathbf{1}^T \lim_{t\to\infty} P^t = 0$.

(3) When $G_Z$ is anti-balanced, by Lemma 2, the limits of the odd and even subsequences of $P^t$ cancel out, thus $c = 0$. □

Lemma 6 indicates that the influence contribution of the balanced ergodic sink component is more complicated than that of the balanced ergodic digraph. This is because the sink component affects the colors of the non-sink component in a complicated way, depending on how the non-sink and sink components are connected. Therefore, the optimal seed selection depends on the calculation of the influence contributions of each sink node, and is not as intuitive as that for the ergodic digraph case.

Theorem 3 shows that in a weakly connected signed digraph $G$ with a single anti-balanced sink component $G_Z$, the influence $f_t(x_0)$ oscillates between odd and even steps in the long term, and the average is $|V|/2$, which is invariant to the initial seed selection. Similar to Remark 1, we can maximize the oscillation strength by properly selecting initial seeds, i.e.,

\begin{align}
W^* &= \arg\max_{|W| \le k} |f_e(e_W) - f_o(e_W)|/2 \notag \\
&= \arg\max_{|W| \le k} \big| (-\mathbf{1}_X^T u_u \pi_{Z,S_Z}^T + \mathbf{1}_Z^T \mathbf{1}_{Z,S_Z} \pi_{Z,S_Z}^T)(e_{WZ} - \tfrac{1}{2}\mathbf{1}_Z) \big| \notag \\
&= \arg\max_{|W| \le k} \big| |S_Z| - |\bar{S}_Z| - \mathbf{1}_X^T u_u \big| \cdot \big| \pi_{Z,S_Z}^T (e_{WZ} - \tfrac{1}{2}\mathbf{1}_Z) \big|, \tag{36}
\end{align}

where the signs follow from Theorem 1 and Theorem 3 at even and odd steps. The leading factor does not depend on $W$, and the maximization objective is independent of $x_{0X}$; thus the oscillation strength maximization problem in Eq. (36) for $G$ is identical to that in Remark 1. Hence, Remark 1 also applies here.

Using Eq. (15), Lemma 5 and Lemma 6 can be readily extended to the case with more than one ergodic sink component and to disconnected digraphs. Algorithm 3 below summarizes how to compute the node influence contributions of weakly connected signed digraphs. Note that by our assumption, we consider all sink components to be ergodic.



Algorithm 3 c = weakly(G)
1: INPUT: Signed transition matrix $P$.
2: OUTPUT: Influence contribution vector $c$.
3: Detect the structure of the weakly connected signed digraph $G$, and find its $m \ge 1$ signed ergodic sink components $G_{Z_1}, \dots, G_{Z_m}$;
4: for $i = 1 : m$ do
5:     if $G_{Z_i}$ is balanced with partition $S_{Z_i}, \bar{S}_{Z_i}$ then
6:         Compute the stationary distribution $\pi_{Z_i}$ of $P_{Z_i}$;
7:         $u_{b,i} = (I_X - P_X)^{-1} P_{Y_i} \mathbf{1}_{Z_i,S_{Z_i}}$;
8:         $c_{Z_i} = (\mathbf{1}_X^T u_{b,i} + |S_{Z_i}| - |\bar{S}_{Z_i}|)\pi_{Z_i,S_{Z_i}}$;
9: $c = [0_X; c_{Z_1}; \cdots; c_{Z_m}]$
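For a single balanced sink, steps 6 to 8 of Algorithm 3 can be carried out with simple iterations instead of a matrix inverse. The sketch below is an added illustration under several assumptions that are ours, not the paper's: the `(i, j, w)` edge-list format (node `i` follows node `j`), a dictionary `sink_side` giving `+1` or `-1` for each sink node, a fixed iteration count, and the fixed-point form $y^T = \mathbf{1}_X^T + y^T P_X$ for $y^T = \mathbf{1}_X^T (I_X - P_X)^{-1}$.

```python
def weakly_contribution(n, edges, sink_side, iters=500):
    """Long-term contributions for one balanced ergodic sink (Lemma 6).

    sink_side: dict mapping each sink node to +1 (in S_Z) or -1 (otherwise);
    every node missing from sink_side is treated as a non-sink node in X.
    """
    out_weight = [0.0] * n
    for i, _, w in edges:
        out_weight[i] += abs(w)

    # Stationary distribution pi_Z of the unsigned walk inside the sink.
    pi = [1.0 / len(sink_side) if v in sink_side else 0.0 for v in range(n)]
    for _ in range(iters):
        nxt = [0.0] * n
        for i, j, w in edges:
            if i in sink_side:
                nxt[j] += pi[i] * abs(w) / out_weight[i]
        pi = nxt

    # y^T = 1_X^T (I_X - P_X)^{-1}, iterating y^T <- 1_X^T + y^T P_X.
    y = [0.0 if v in sink_side else 1.0 for v in range(n)]
    for _ in range(iters):
        nxt = [0.0 if v in sink_side else 1.0 for v in range(n)]
        for i, j, w in edges:
            if i not in sink_side and j not in sink_side:
                nxt[j] += y[i] * w / out_weight[i]
        y = nxt

    # 1_X^T u_b = y^T P_Y 1_{Z,S_Z}: signed mass flowing from X into the sink.
    reach = sum(y[i] * (w / out_weight[i]) * sink_side[j]
                for i, j, w in edges
                if i not in sink_side and j in sink_side)
    scale = reach + sum(sink_side.values())  # 1_X^T u_b + |S_Z| - |S_Z-bar|
    return [scale * sink_side[v] * pi[v] if v in sink_side else 0.0
            for v in range(n)]


# Sink {0, 1, 2} is balanced with S_Z = {0, 1}; node 3 is a non-sink node
# with a positive self-loop and a positive edge into the sink.
edges = [(0, 1, 1), (1, 0, 1), (1, 2, -1), (2, 0, -1), (3, 3, 1), (3, 0, 1)]
c = weakly_contribution(4, edges, {0: 1, 1: 1, 2: -1})
```

In this example $\mathbf{1}_X^T u_b = 1$ and $|S_Z| - |\bar{S}_Z| = 1$, so the sink contributions are twice the signed stationary distribution, approximately (0.8, 0.8, -0.4), and the non-sink node contributes 0. Both iterations converge quickly here, but as the complexity discussion notes, convergence can be slow in general.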



4.3.3 General case and SVIM-L algorithm 

Given the above systematic analysis, we are now in a position to summarize and introduce our SVIM-L algorithm, which solves the long-term voter model influence maximization problem for general aperiodic signed digraphs.

In general, a signed digraph consists of $m \ge 1$ disconnected components, within each of which the node influence contribution follows Lemma 6. The long-term signed voter model influence maximization (SVIM-L) algorithm is constructed in Algorithm 4.

Algorithm 4 Long-term influence maximization SVIM-L
1: INPUT: Signed transition matrix $P$, budget $k$.
2: OUTPUT: White seed set $W$.
3: Detect the structure of a general aperiodic signed digraph $G$, and find the $m \ge 1$ disconnected components $G_1, \dots, G_m$;
4: for $i = 1 : m$ do
5:     $c_{G_i}$ = weakly($G_i$);
6: $c = [c_{G_1}; \cdots; c_{G_m}]$;
7: $W$ = top $\min\{k, n^+(c)\}$ nodes with the highest $c(i)$ values.

Complexity analysis. We consider $G = (V, E, A)$ to be weakly connected, since for the time complexity the disconnected graph case can be treated independently for each connected component. The SVIM-L algorithm consists of two parts. The first part extracts the connectivity and balance structure of the graph, which can be done using depth-first search with complexity $O(|E|)$. The second part uses Algorithm 3 to compute the influence contributions of balanced ergodic sink components. The dominant computations are on the stationary distributions $\pi_{Z_i}$ and on $(I_X - P_X)^{-1}$, which can be done by solving a linear equation system [33] and by matrix inversion in $O(|Z_i|^3)$ and $O(n_X^3)$, respectively, where $n_X = |X|$. Let $b$ be the number of balanced sink components in $G$, and $n_Z$ be the number of nodes in the largest balanced sink component. Thus SVIM-L can be done in $O(b n_Z^3 + n_X^3)$ time. Alternatively, we can use an iterative method for computing both the $\pi_{Z_i}$'s and $\mathbf{1}_X^T (I_X - P_X)^{-1}$, if the largest convergence time $t_C$ of the $P_{Z_i}^t$'s and $P_X^t$ is small (note that the convergence time of ergodic digraphs could be exponentially large in general, as illustrated by an example in Appendix C). In this case, each iteration step involves a vector-matrix multiplication and can be done in $O(m_B)$ time, where $m_B$ is the number of edges of the induced subgraph $G_B$ consisting of all nodes in the balanced sink components and $X$. Note that $m_B$ and $t_C$ are only related to the subgraph $G_B$, which could be significantly smaller than $G$, and thus $O(t_C m_B)$ could be much smaller than the time of naive iterations on the entire graph. Overall, SVIM-L can be done in $O(|E| + \min(b n_Z^3 + n_X^3, \; t_C m_B))$ time.

5 Evaluation 

In this section, we use both synthetic datasets and real social network datasets to evaluate the efficacy of our short-term and long-term seed selection schemes. For different scenarios, we compare our SVIM-L and SVIM-S algorithms with three heuristics, i.e., (1) selecting seed nodes with the highest weighted outgoing degrees (denoted by $d^+ + d^-$ in the figures), (2) highest weighted outgoing positive degrees (denoted by $d^+$), and (3) highest differences between weighted outgoing positive and negative degrees (denoted by $d^+ - d^-$). Our evaluation results demonstrate that our seed selection scheme can increase long-term influence by up to 72% and short-term influence by up to 51% over the other heuristics.

5.1 Synthetic datasets 

In this part, we generate synthetic datasets with different structures to validate our results. 

Dataset generation model. In our tests, we generate six types of signed digraphs, including balanced ergodic digraphs, anti-balanced ergodic digraphs, strictly unbalanced ergodic digraphs, weakly connected signed digraphs, disconnected signed digraphs with ergodic components, and disconnected signed digraphs with weakly connected components (WCCs). All edges have unit weights. The following are the graph configuration details.

We first create an unsigned ergodic digraph $G$ with 9500 nodes, which has two ergodic components $G_A$ and $G_B$, with [3000, 6500] nodes and [3000, 6500] × 8 random directed edges, respectively. Moreover, there are 3000 × 8 random directed edges across $G_A$ and $G_B$. Ergodicity is checked through a simple connectivity and aperiodicity check. Given $G$, a balanced digraph is obtained by assigning positive signs to all edges within $G_A$ and $G_B$, and negative signs to those across them. Then, an anti-balanced digraph is generated by negating all edge signs of the balanced ergodic digraph. To generate a strictly unbalanced digraph, we randomly assign edge signs to all edges in $G$ and make sure that there does not exist a balanced or anti-balanced bipartition.
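The construction of the balanced and anti-balanced test graphs can be sketched as follows. This is an illustrative generator under our own assumptions, not the authors' Matlab script: edges are drawn uniformly at random (so self-loops and duplicate edges may occur), the ergodicity check described above is omitted, and the function names, the `seed` parameter, and the scaled-down sizes in the example are hypothetical.

```python
import random


def balanced_digraph(size_a, size_b, out_deg, cross_edges, seed=0):
    """Random signed digraph that is balanced by construction.

    Nodes 0..size_a-1 form group A, the rest form group B. Edges inside a
    group are positive, edges between the groups are negative. All weights
    are +1 or -1. Ergodicity is NOT checked here.
    """
    rng = random.Random(seed)
    group_a = range(size_a)
    group_b = range(size_a, size_a + size_b)
    edges = []
    for group in (group_a, group_b):
        for _ in range(len(group) * out_deg):
            edges.append((rng.choice(group), rng.choice(group), 1))
    everyone = range(size_a + size_b)
    for _ in range(cross_edges):
        i = rng.choice(everyone)
        other = group_b if i < size_a else group_a
        edges.append((i, rng.choice(other), -1))
    return edges


def negate(edges):
    """Flip every edge sign; turns a balanced digraph into an anti-balanced one."""
    return [(i, j, -w) for i, j, w in edges]


# Scaled-down analogue of the 3000/6500-node setup described above.
balanced_edges = balanced_digraph(30, 65, 8, 240)
anti_balanced_edges = negate(balanced_edges)
```

A strictly unbalanced instance would additionally need random sign assignment followed by a check that neither the graph nor its negation admits a balanced bipartition.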

Moreover, we generated a disconnected signed digraph and a weakly connected signed digraph for our study. We first generate 5 ergodic unsigned digraphs, $G_1, \dots, G_5$, with [500, 200, 800, 300, 2700] nodes and [500, 200, 800, 300, 2700] × 8 edges, respectively. Then, we group $G_{23} = (G_2, G_3)$ and $G_{45} = (G_4, G_5)$ to form two ergodic balanced digraphs, and generate a strictly unbalanced ergodic digraph $G_1$ by randomly assigning signs to edges in $G_1$. The three disconnected components $G_1$, $G_{23}$, $G_{45}$ together form a disconnected signed digraph. To form a weakly connected signed digraph, we place in total 3000 random directed edges from $G_1$ to the balanced ergodic components $G_{23}$ and $G_{45}$, where the nodes in subgraph $G_1$ only have outgoing edges to $G_{23}$ and $G_{45}$. Moreover, we combine the above generated balanced ergodic digraph and the weakly connected signed digraph to form a larger disconnected signed digraph, with the weakly connected signed digraph as a component. Fig. 1(a) to Fig. 1(f) present the evaluation results for one set of digraphs; all digraphs we randomly generated exhibit consistent results. Our tests are conducted using Matlab on a standard PC server.
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Figure 1 : Long-term influence maximization on various signed digraphs 



5.1.1 Long-term influence maximization 

In the evaluations, we set the influence budget to k = 500. Fig. 1(a) shows that in the balanced ergodic digraph, the
SVIM-L algorithm achieves the highest long-term influence, with up to 14% more influence than the other heuristics.
Fig. 1(b) shows the clear oscillating behavior on the anti-balanced ergodic digraph, where the average influence is the
same for all algorithms. The inset shows that our algorithm (denoted as "Max. Osc") indeed provides the
largest oscillation, which confirms the corresponding Remark. Fig. 1(c) shows the results for the strictly unbalanced
graph, where the long-term influences of all algorithms converge to 4750 = |V|/2, which matches the corresponding
Theorem. Fig. 1(d) and Fig. 1(e) show that the SVIM-L algorithm achieves up to 72% more long-term influence than the
other heuristics in the weakly connected signed digraph and the disconnected signed digraph. Fig. 1(f) shows that in a
more general signed digraph, which consists of a weakly connected signed component and a balanced ergodic component,
the SVIM-L algorithm outperforms all other heuristics with up to 17% more long-term influence. In general, we see that
for weakly connected and disconnected digraphs, SVIM-L has larger winning margins over the other heuristics than in the
case of balanced ergodic digraphs (Fig. 1(d)-1(f) vs. Fig. 1(a)). We attribute this to our accurate computation of
influence contribution in the more involved weakly connected and disconnected digraph cases. Moreover, in all cases,
the dynamics converge very fast, i.e., in only a few steps, which indicates that the convergence time of the voter
model on these random graphs is very small.
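The three long-term behaviors reported above (convergence, oscillation, and convergence to |V|/2) can be reproduced on a toy example. The sketch below assumes the signed voter update in which each node picks a uniformly random out-neighbor and copies its opinion across a positive edge or takes the opposite opinion across a negative edge; under that assumption, with opinions encoded as +1/-1, the expected opinion vector evolves linearly under the signed transition matrix. The 3-node graphs, the seed choice, and the function names are hypothetical.

```python
from fractions import Fraction

def signed_transition(n, signed_edges):
    """Row-normalised signed adjacency: entry (u, v) is sign / out-degree of u."""
    out_deg = [0] * n
    for (u, _v) in signed_edges:
        out_deg[u] += 1
    P = [[Fraction(0)] * n for _ in range(n)]
    for (u, v), s in signed_edges.items():
        P[u][v] = Fraction(s, out_deg[u])
    return P

def expected_influence(P, spins, steps):
    """Expected number of +1 nodes after each step, using E[s_{t+1}] = P E[s_t]."""
    s = [Fraction(x) for x in spins]
    history = []
    for _ in range(steps):
        s = [sum(p * x for p, x in zip(row, s)) for row in P]
        history.append(sum((1 + x) / 2 for x in s))
    return history

pairs = [(u, v) for u in range(3) for v in range(3) if u != v]
balanced = {e: +1 for e in pairs}            # all-positive complete digraph
anti_balanced = {e: -1 for e in pairs}       # all signs negated
unbalanced = dict(balanced)
unbalanced[(1, 0)] = -1                      # 2-cycle 0 <-> 1 with mixed signs

seeds = [+1, -1, -1]                         # node 0 holds the promoted opinion
hist_bal = expected_influence(signed_transition(3, balanced), seeds, 40)
hist_anti = expected_influence(signed_transition(3, anti_balanced), seeds, 40)
hist_unb = expected_influence(signed_transition(3, unbalanced), seeds, 40)

print([float(x) for x in hist_bal[:4]])
print([float(x) for x in hist_anti[:4]])
print(float(hist_unb[-1]))
```

In this toy setting the anti-balanced graph alternates between two influence levels, while the strictly unbalanced graph forgets the seed and approaches half of the nodes, mirroring Fig. 1(b) and Fig. 1(c).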
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5.2 Real datasets 



In this section, we use real datasets, namely, Epinions and Slashdot datasets, to validate our theoretical results and our 
SVIM algorithm. 



Table 2: Statistics of Epinions and Slashdot datasets

| Statistics | Epinions | Slashdot |
|---|---|---|
| # of nodes | 131580 | 77350 |
| # of edges | 840799 | 516575 |
| # of positive edges | 717129 | 396378 |
| # of negative edges | 123670 | 120197 |
| # of nodes in largest SCC | 41441 | 26996 |
| # of edges in largest SCC | 693507 | 337351 |
| # of positive edges in largest SCC | 614314 | 259891 |
| # of negative edges in largest SCC | 79193 | 77460 |
| # of strongly connected components | 88361 | 49209 |
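As a quick sanity check on Table 2, the positive and negative edge counts should add up to the total edge counts, both for the whole graph and for the largest SCC. The short sketch below performs that arithmetic and also derives the fraction of negative edges; the dictionary layout and key names are illustrative choices, and the numbers are copied from the table.

```python
table2 = {
    "Epinions": {"edges": 840799, "pos": 717129, "neg": 123670,
                 "scc_edges": 693507, "scc_pos": 614314, "scc_neg": 79193},
    "Slashdot": {"edges": 516575, "pos": 396378, "neg": 120197,
                 "scc_edges": 337351, "scc_pos": 259891, "scc_neg": 77460},
}

def negative_fraction(stats):
    """Share of edges that carry a negative (distrust / foe) sign."""
    return stats["neg"] / stats["edges"]

for name, stats in table2.items():
    consistent = (stats["pos"] + stats["neg"] == stats["edges"]
                  and stats["scc_pos"] + stats["scc_neg"] == stats["scc_edges"])
    print(f"{name}: consistent={consistent}, negative share={negative_fraction(stats):.3f}")
```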



5.2.1 Epinions Dataset

Figure 2: Short-term influence maximization on the entire Epinions dataset. (Panel (c): maximize average influence for each t.)

Epinions.com [15] is a consumer review online social site, where users can write reviews of various items and vote
for or against other users. The signed digraph is formed with a positive or negative directed edge (u, v) meaning that
u trusts or distrusts v. The statistics are shown in Table 2. We apply our short-term SVIM-S algorithm to the entire
Epinions digraph as well as its largest strongly connected component (SCC), and compare it with three heuristics, i.e.,
$d^+ + d^-$, $d^+$, and $d^+ - d^-$.
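The three baseline heuristics rank nodes by simple functions of their positive and negative degrees. A minimal sketch follows; this section does not say whether the degrees are counted over incoming or outgoing edges, so the sketch assumes incoming edges by default (how many users trust or distrust a node) and exposes a hypothetical `use_incoming` switch. The function names and the toy edge list are illustrative only.

```python
from collections import Counter

def signed_degrees(signed_edges, use_incoming=True):
    """Count positive and negative edges per node (incoming by default)."""
    pos, neg = Counter(), Counter()
    for (u, v), sign in signed_edges.items():
        node = v if use_incoming else u
        (pos if sign > 0 else neg)[node] += 1
    return pos, neg

HEURISTICS = {
    "d+ + d-": lambda p, n: p + n,   # total degree
    "d+": lambda p, n: p,            # positive degree only
    "d+ - d-": lambda p, n: p - n,   # net positive degree
}

def select_seeds(signed_edges, k, heuristic, use_incoming=True):
    """Return the k highest-scoring nodes; ties are broken by node id."""
    pos, neg = signed_degrees(signed_edges, use_incoming)
    nodes = sorted({x for edge in signed_edges for x in edge})
    score = HEURISTICS[heuristic]
    ranked = sorted(nodes, key=lambda v: (-score(pos[v], neg[v]), v))
    return ranked[:k]

toy = {(0, 1): +1, (2, 1): +1, (3, 1): -1, (4, 1): -1, (5, 1): -1,
       (0, 2): +1, (1, 2): +1, (3, 2): +1, (4, 3): +1}
for name in HEURISTICS:
    print(name, select_seeds(toy, 2, name))
```

None of these scores depends on the target step t, which is why the heuristic curves in the figures correspond to a single fixed seed set.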

Our first batch of tests is on the entire digraph. We first look at the seed selection schemes for maximizing the
instant influence at step t. Fig. 2(a) shows the expected maximum instant influence at each step by different methods.
Note that since the initial seeds selected by the SVIM-S algorithm hinge on t, the values on the curve of our selection
scheme are associated with different optimal initial seed sets. On the other hand, the seed selection of the other
heuristics is independent of t, thus the corresponding curves represent the same initial seed sets. We can see from
Fig. 2(a) that with different budgets, i.e., 500 and 6k seeds, our SVIM-S algorithm outperforms the other heuristics.
For example, the SVIM-S algorithm has up to 47% more influence than the other heuristics for t = 2. Fig. 2(b) shows how
the influence evolves over steps, given that the goal is to maximize the influence at step 3, where the SVIM-S
algorithm achieves up to 27% more influence at step 3.

Figure 3: Short-term influence maximization on the largest SCC of Epinions dataset.
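To illustrate why the best seed set for instant influence depends on t, the sketch below ranks nodes by their contribution to the expected influence at step t. It is not necessarily the SVIM-S procedure itself. It assumes the signed voter update (copy a random out-neighbor across a positive edge, take the opposite opinion across a negative edge), that all non-seed nodes start with the opposite opinion, and that seeds are not held fixed after step 0. Under those assumptions the expected influence at step t is linear in the seed indicator vector, with per-node contribution given by the column sums of the t-th power of the signed transition matrix. The toy graph and all names are hypothetical.

```python
from fractions import Fraction

def contribution(n, signed_edges, t):
    """Row vector 1^T P^t, where P is the signed, row-normalised transition matrix."""
    out_deg = [0] * n
    for (u, _v) in signed_edges:
        out_deg[u] += 1
    c = [Fraction(1)] * n
    for _ in range(t):
        nxt = [Fraction(0)] * n
        for (u, v), s in signed_edges.items():
            nxt[v] += c[u] * Fraction(s, out_deg[u])
        c = nxt
    return c

def best_seeds(n, signed_edges, t, k):
    """Pick the k nodes with the largest contribution at step t (ties by node id)."""
    c = contribution(n, signed_edges, t)
    return sorted(range(n), key=lambda v: (-c[v], v))[:k]

def influence_at_step(n, signed_edges, t, seeds):
    """Expected number of nodes holding the seed opinion at step t."""
    c = contribution(n, signed_edges, t)
    return (n + 2 * sum(c[v] for v in seeds) - sum(c)) / 2

# Node 0 is followed by nodes 1, 2, 4 (trust) and 3 (distrust); node 0 follows node 4.
toy = {(0, 4): +1, (1, 0): +1, (2, 0): +1, (3, 0): -1, (4, 0): +1}
print(best_seeds(5, toy, 1, 1), best_seeds(5, toy, 2, 1))
print(influence_at_step(5, toy, 1, [0]), influence_at_step(5, toy, 2, [4]))
```

In the toy graph, node 0 is the best single seed for t = 1 because most nodes copy it directly, whereas node 4 is best for t = 2 because node 0 first copies node 4 and is then copied by the others.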

Next we compare the seed selection schemes for maximizing the average influence within the first t steps. Fig. 2(c)
shows the expected maximum average influence within the first t steps by different methods. Again, the values
on the curve of the SVIM-S algorithm are associated with different initial seed sets. Fig. 2(c) shows that with
different budgets, i.e., 500 and 6k seeds, the SVIM-S algorithm outperforms the other heuristics with up to 26% more
influence when t = 6. Fig. 2(d) shows how the expected average influence evolves over steps for t = 3, where the SVIM-S
algorithm achieves up to 36% more average influence over the first 3 steps. Moreover, in all these figures, we observe
that our seed selection scheme results in the highest long-term influence among all methods compared.

In the second set of simulation results, we compare the influences obtained by different schemes on the largest
strongly connected component (SCC) of the Epinions dataset, which is ergodic and strictly unbalanced. Fig. 3(a)-3(d)
show results similar to those on the entire Epinions dataset, where the SVIM-S algorithm always outperforms
the other heuristics. The slight difference is that the performance of the algorithms converges faster than on the
entire graph, because the SCC has better connectivity than the entire graph.
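Extracting the largest SCC is a standard preprocessing step; the text does not state which algorithm was used, so the following is only one possible sketch, using Kosaraju's two-pass method with explicit stacks to avoid recursion limits on large graphs. Edge signs play no role in strong connectivity and are therefore ignored. The function name and the toy edge list are hypothetical.

```python
def strongly_connected_components(n, edges):
    """Return the SCCs of a digraph on nodes 0..n-1 as sorted lists of nodes."""
    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes in order of DFS finishing time on the forward graph.
    order, seen = [], [False] * n
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, iter(fwd[root]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:  # all out-neighbours explored: the node is finished
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finishing time.
    comp = [-1] * n
    components = []
    for root in reversed(order):
        if comp[root] != -1:
            continue
        comp[root] = len(components)
        members, stack = [], [root]
        while stack:
            node = stack.pop()
            members.append(node)
            for nxt in rev[node]:
                if comp[nxt] == -1:
                    comp[nxt] = comp[root]
                    stack.append(nxt)
        components.append(sorted(members))
    return components

toy_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
sccs = strongly_connected_components(6, toy_edges)
largest = max(sccs, key=len)
print(len(sccs), largest)
```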



5.2.2 Slashdot Dataset 

Slashdot.org [31] provides a discussion forum on various technology-related topics, where members can submit their
stories, and comment on other members' stories. Its Slashdot Zoo feature allows members to tag each other as friends
or foes, which in turn forms a signed online social network. The network was collected on November 6, 2008 [24],
and the statistics are shown in Table 2.

We evaluate our SVIM-S algorithm on the entire Slashdot dataset (Fig. 4(a)-4(d)) and its largest strongly connected
component (Fig. 5(a)-5(d)), respectively, and our results show that our SVIM-S algorithm performs the best among
all methods tested, especially in the early steps.

Moreover, convergence on both real-world datasets is fast, within a few tens of steps, indicating good
connectivity and a fast mixing property of real-world networks. In summary, our evaluation results on both synthetic
and real-world networks validate our theoretical results and demonstrate that our SVIM algorithms, for both the
short-term and the long-term objectives, perform best among the methods tested, often with significant winning margins.
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Figure 4: Short-term influence maximization on the entire Slashdot dataset. 



6 Conclusion 

In this paper, we propose and study voter model dynamics on signed digraphs, and apply it to solve the influence 
maximization problem. We provide rigorous mathematical analysis to completely characterize the short-term and 
long-term dynamics, and provide efficient algorithms to solve both short-term and long-term influence maximization 
problems. Extensive simulation results on both synthetic and real-world graphs demonstrate the efficacy of our signed 
voter model influence maximization (SVIM) algorithms. We also identify a class of anti-balanced digraphs, which has
not been covered by social balance theory before and which exhibits oscillating steady-state behavior.

There exist several open problems and future directions. One open problem is the convergence time of voter 
model dynamics on signed digraphs. For balanced and anti-balanced ergodic digraphs, our results show that their 
convergence times are the same as the corresponding unsigned digraphs. For strictly unbalanced ergodic digraphs and 
more general weakly connected signed digraphs, the problem is quite open. A future direction is to study influence 
diffusion in signed networks under other models, such as the voter model with a background color, the independent 
cascade model, and the linear threshold model. 
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Figure 5: Short-term influence maximization on the largest strongly connected component of Slashdot dataset. 

7 Acknowledgement 

We would like to thank Christian Borgs and Jennifer T. Chayes for pointing out the relations between the signed
digraph voter model and concepts in physics, such as the Ising model and gauge transformations. We also thank Zhenming
Liu for many useful discussions on this work. This work was mostly done while the first author was working as a
full-time intern at Microsoft Research Asia.



References 

[1] M. Ángeles Serrano, K. Klemm, F. Vazquez, V. Eguíluz, and M. San Miguel. Conservation laws for voter-like
models on random directed networks. Journal of Statistical Mechanics: Theory and Experiment, 2009:P10024,
2009.

[2] S. Bharathi, D. Kempe, and M. Salek. Competitive influence maximization in social networks. In WINE, 2007. 

[3] C. Borgs, J. Chayes, A. Kalai, A. Malekian, and M. Tennenholtz. A novel approach to propagating distrust. 
Internet and Network Economics, 2010. 

[4] A. Borodin, Y. Filmus, and J. Oren. Threshold models for competitive influence in social networks. In WINE, 
2010. 

[5] F. Brandt, T. Sandholm, and Y. Shoham. Spiteful bidding in sealed-bid auctions. In IJCAI, 2007. 

[6] C. Budak, D. Agrawal, and A. E. Abbadi. Limiting the spread of misinformation in social networks. In WWW, 
2011. 






[7] W. Chen, A. Collins, R. Cummings, T. Ke, Z. Liu, D. Rincon, X. Sun, Y. Wang, W. Wei, and Y. Yuan. Influence
maximization in social networks when negative opinions may emerge and propagate. In SDM, 2011.

[8] W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale 
social networks. In KDD, 2010. 

[9] W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. In KDD, 2009. 

[10] W. Chen, Y. Yuan, and L. Zhang. Scalable influence maximization in social networks under the linear threshold 
model. In ICDM, 2010. 

[11] K. Chiang, N. Natarajan, A. Tewari, and I. Dhillon. Exploiting longer cycles for link prediction in signed 
networks. 2011. 

[12] F. R. K. Chung. Laplacians and the cheeger inequality for directed graphs. Annals of Combinatorics, 9:1-19, 
sep 2005. 

[13] P. Clifford and A. Sudbury. A model for spatial conflict. Biometrika, 60(3):581, 1973. 

[14] D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. 
Cambridge, 2010. 

[15] Epinions. Dataset. http://www.epinions.com/. 

[16] E. Even-Dar and A. Shapira. A note on maximizing the spread of influence in social networks. In WINE, 2007. 

[17] A. Goyal, F. Bonchi, and L. V. S. Lakshmanan. A data-based approach to social influence maximization. PVLDB,
5(1):73-84, 2008.

[18] A. Goyal, W. Lu, and L. V. S. Lakshmanan. Simpath: An efficient algorithm for influence maximization under
the linear threshold model. In ICDM, 2011.

[19] R. Holley and T. Liggett. Ergodic theorems for weakly interacting infinite systems and the voter model. The 
annals of probability, 1975. 

[20] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In KDD, 
2003. 

[21] M. Kimura and K. Saito. Tractable models for information diffusion in social networks. In PKDD, 2006. 

[22] J. Kunegis, S. Schmidt, A. Lommatzsch, J. Lerner, E. W. D. Luca, and S. Albayrak. Spectral analysis of signed 
graphs for clustering, prediction and visualization. In SDM, 2010. 

[23] J. Leskovec, D. Huttenlocher, and J. Kleinberg. Predicting positive and negative links in online social networks. 
In WWW, 2010. 

[24] J. Leskovec, D. Huttenlocher, and J. Kleinberg. Signed networks in social media. In CHI. ACM, 2010. 

[25] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. M. VanBriesen, and N. S. Glance. Cost-effective outbreak 
detection in networks. In KDD, 2007. 

[26] Y. Li and Z.-L. Zhang. Random walks on digraphs, the generalized digraph laplacian and the degree of
asymmetry. In LNCS WAW, 2010.

[27] N. Masuda and H. Ohtsuki. Evolutionary dynamics and fixation probabilities in directed networks. New Journal 
of Physics, 11:033012, 2009. 

[28] J. Morgan, K. Steiglitz, and G. Reis. The spite motive and equilibrium behavior in auctions. The BE Journal of
Economic Analysis & Policy, 2(1):1102-1127, 2003.




[29] R. Narayanam and Y. Narahari. Determining the top-k nodes in social networks using the shapley value. In 
AAMAS, 2008. 

[30] N. Pathak, A. Banerjee, and J. Srivastava. A generalized linear threshold model for multiple cascades. In ICDM, 
2010. 

[31] Slashdot. Dataset. http://slashdot.org/. 

[32] V. Sood, T. Antal, and S. Redner. Voter models on heterogeneous networks. Physical Review E, 77(4):041121,
2008.

[33] W. Stewart. Numerical methods for computing stationary distributions of finite irreducible markov chains. Com- 
putational Probability, 2000. 

A Properties of ergodic digraphs 

Proposition 4. Let G = (V, E, A) be an ergodic digraph. For any nodes $i, j \in V$, there exist two paths from i to j
with even and odd length, respectively.

Proof. Suppose, for a contradiction, that all paths from i to j have even lengths. This implies that all cycles passing
through i must have even length, since otherwise we could follow an odd-length cycle through node i and then the
even-length path from i to j, making the entire path from i to j odd. Now consider any cycle $C_r$ in G, not
necessarily passing through i. We claim that $C_r$ must have even length. In fact, we can pick any node u on $C_r$ and
construct a path from i to j with the following segments: $R_1$ from i to u, $C_r$, $R_2$ from u back to i, and $R_3$
from i to j. Since $R_1 + R_2$ is a cycle through i and thus has even length, and $R_3$ has even length, $C_r$ must
have even length by our assumption. However, this means that all cycles in G have even lengths, contradicting the
aperiodicity of G.

The case of odd length paths can be proved in the same way. □ 
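Proposition 4 can be checked mechanically on small graphs. The sketch below runs a breadth-first search over (node, parity) states, so that a state (j, 0) or (j, 1) is reached exactly when some walk of positive even or odd length leads from the source to j. Paths are treated as walks that may revisit nodes, as in the proof above; the function names and the two example graphs are hypothetical.

```python
from collections import deque

def walk_parities(n, edges, source):
    """Set of (node, parity) pairs reachable from `source` by walks of length >= 1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    seen = set()
    queue = deque()
    for v in adj[source]:            # walks of length 1 have odd parity
        if (v, 1) not in seen:
            seen.add((v, 1))
            queue.append((v, 1))
    while queue:
        u, parity = queue.popleft()
        for v in adj[u]:
            state = (v, 1 - parity)  # one more edge flips the parity
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return seen

def has_even_and_odd_walks(n, edges):
    """True if every ordered pair (i, j) is joined by both an even and an odd walk."""
    for i in range(n):
        seen = walk_parities(n, edges, i)
        if any((j, parity) not in seen for j in range(n) for parity in (0, 1)):
            return False
    return True

ergodic = [(0, 1), (1, 2), (2, 0), (0, 2)]     # contains cycles of length 2 and 3
periodic = [(0, 1), (1, 2), (2, 3), (3, 0)]    # a single directed 4-cycle
print(has_even_and_odd_walks(3, ergodic), has_even_and_odd_walks(4, periodic))
```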

Proposition 5. Let G = (V, E, A) be an ergodic unsigned digraph, with transition probability matrix P and stationary
distribution vector $\pi$. Then $P^t - \mathbf{1}\pi^T = (P - \mathbf{1}\pi^T)^t$ holds for any integer t > 0.

Proof. Using the facts that $P\mathbf{1} = \mathbf{1}$ and $\pi^T P = \pi^T$, it is easy to prove by induction that
$P^t - \mathbf{1}\pi^T = (P - \mathbf{1}\pi^T)^t$ holds for any integer t > 0. □
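The induction step behind Proposition 5 expands $(P^t - \mathbf{1}\pi^T)(P - \mathbf{1}\pi^T)$ and uses $P^t\mathbf{1} = \mathbf{1}$, $\pi^T P = \pi^T$ and $\pi^T\mathbf{1} = 1$, so that three of the four terms collapse to a single $-\mathbf{1}\pi^T$. The identity can also be verified exactly on a small example with rational arithmetic. The 3-node transition matrix below is a hypothetical ergodic digraph chosen for illustration, with its stationary distribution worked out by hand.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matpow(A, t):
    """A raised to an integer power t >= 1."""
    result = A
    for _ in range(t - 1):
        result = matmul(result, A)
    return result

half = Fraction(1, 2)
P = [[0, half, half],     # node 0 -> nodes 1 and 2
     [0, 0, 1],           # node 1 -> node 2
     [1, 0, 0]]           # node 2 -> node 0
pi = [Fraction(2, 5), Fraction(1, 5), Fraction(2, 5)]

one_pi = [list(pi) for _ in range(3)]   # the rank-one matrix 1 * pi^T
shifted = [[P[i][j] - one_pi[i][j] for j in range(3)] for i in range(3)]

def identity_holds(t):
    """Check P^t - 1 pi^T == (P - 1 pi^T)^t exactly."""
    lhs = [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(matpow(P, t), one_pi)]
    return lhs == matpow(shifted, t)

print(all(identity_holds(t) for t in range(1, 9)))
```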



B Special matrix power series 

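Proposition 6. Let $X \in \mathbb{R}^{m \times m}$, $Y \in \mathbb{R}^{m \times n}$ and $Z \in \mathbb{R}^{n \times n}$.
If $\lim_{t \to \infty} X^t = \lim_{t \to \infty} Z^t = 0$, the following equalities hold:

$$\text{(i)} \quad \lim_{t \to \infty} \sum_{i=0}^{t-1} X^i = (I - X)^{-1}, \tag{37}$$

$$\text{(ii)} \quad \lim_{t \to \infty} \sum_{i=0}^{t-1} X^i Y Z^{t-1-i} = 0. \tag{38}$$

Proof. (i) Let $\rho(X)$ be the spectral radius of matrix X, i.e., the largest absolute value of the eigenvalues of X.
Notice that $\lim_{t \to \infty} X^t = 0$ if and only if $\rho(X) < 1$.

We first claim that $I - X$ and $I - Z$ are invertible. Suppose $I - X$ is not invertible; then there is a non-zero
vector p such that $(I - X)p = 0$. Therefore, p is an eigenvector of X with eigenvalue 1, which contradicts
$\lim_{t \to \infty} X^t = 0$. The same argument can be applied to $I - Z$. Hence, the left-hand side of Eq. (37)
equals

$$\lim_{t \to \infty} \sum_{i=0}^{t} X^i = \lim_{t \to \infty} (I - X)^{-1}(I - X^{t+1}) = (I - X)^{-1}.$$

(ii) The max-norm of X is given by $\|X\|_{\max} = \max_{i,j \le m} \{|X_{ij}|\}$. Let $X = Q_X \mathcal{J} Q_X^{-1}$ be
the standard Jordan form of X, where $Q_X$ is an invertible matrix. Denote $J = \mathbf{1}\mathbf{1}^T$ as the all-one
matrix. Hence, we have

$$\|X^i\|_{\max} = \|Q_X \mathcal{J}^i Q_X^{-1}\|_{\max} \le \|Q_X\|_{\max} \|Q_X^{-1}\|_{\max} \|J \mathcal{J}^i J\|_{\max} \le \|Q_X\|_{\max} \|Q_X^{-1}\|_{\max} \, m^2 \|\mathcal{J}^i\|_{\max}.$$

$\mathcal{J}^i$ has the form

$$\mathcal{J}^i = \begin{bmatrix}
\lambda_1^i & C_i^1 \lambda_1^{i-1} & \cdots & & \\
 & \lambda_1^i & \ddots & & \\
 & & \ddots & & \\
 & & & \lambda_{m_0}^i & C_i^1 \lambda_{m_0}^{i-1} \\
 & & & & \lambda_{m_0}^i
\end{bmatrix}, \tag{39}$$

where $C_i^\ell = \frac{i!}{\ell!(i-\ell)!} \le i^m$ and each non-zero entry in $\mathcal{J}^i$ can be expressed as
$C_i^\ell \lambda_k^{i-\ell}$, $1 \le k \le m_0$, $1 \le \ell \le \ell_0(k)$, with $m_0$ as the number of different
eigenvalues of X and $\ell_0(k)$ as the multiplicity of the k-th eigenvalue of X. Hence, the absolute value of each
non-zero entry in $\mathcal{J}^i$ is upper bounded as $|C_i^\ell \lambda_k^{i-\ell}| \le i^m \rho(X)^{i-m}$, which
implies that

$$\|X^i\|_{\max} \le \|Q_X\|_{\max} \|Q_X^{-1}\|_{\max} \, m^2 i^m \rho(X)^{i-m}.$$

Let $\rho = \max(\rho(X), \rho(Z))$; we have

$$\begin{aligned}
\lim_{t \to \infty} \Big\| \sum_{i=0}^{t-1} X^i Y Z^{t-1-i} \Big\|_{\max}
&\le \lim_{t \to \infty} t\,mn\, \|X^i\|_{\max} \|Y\|_{\max} \|Z^{t-1-i}\|_{\max} \\
&\le \lim_{t \to \infty} t\,mn\, T_{\max} (m^2 t^m \rho^{i-m})(n^2 t^n \rho^{t-1-i-n})
\le \lim_{t \to \infty} m^3 n^3 T_{\max}\, t^{m+n+1} \rho^{t-1-m-n} = 0,
\end{aligned}$$

where $T_{\max} = \|Y\|_{\max} \|Q_X\|_{\max} \|Q_X^{-1}\|_{\max} \|Q_Z\|_{\max} \|Q_Z^{-1}\|_{\max}$.

□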
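Both limits in Proposition 6 are easy to observe numerically. The sketch below uses small hypothetical matrices X and Z whose spectral radii are below 1 (about 0.57 and 0.4) together with an arbitrary Y, and evaluates the partial sums directly from their definitions; all names and values are illustrative.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matadd(A, B):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def powers(A, count):
    """Return [A^0, A^1, ..., A^(count-1)]."""
    result = [identity(len(A))]
    for _ in range(count - 1):
        result.append(matmul(result[-1], A))
    return result

def geometric_sum(X, t):
    """Partial sum of X^i for i = 0 .. t-1, as in Eq. (37)."""
    total = [[0.0] * len(X) for _ in X]
    for Xi in powers(X, t):
        total = matadd(total, Xi)
    return total

def mixed_sum(X, Y, Z, t):
    """Partial sum of X^i Y Z^(t-1-i) for i = 0 .. t-1, as in Eq. (38)."""
    xp, zp = powers(X, t), powers(Z, t)
    total = [[0.0] * len(Y[0]) for _ in Y]
    for i in range(t):
        total = matadd(total, matmul(matmul(xp[i], Y), zp[t - 1 - i]))
    return total

X = [[0.5, 0.2], [0.1, 0.3]]
Y = [[1.0], [2.0]]
Z = [[0.4]]

I_minus_X = [[0.5, -0.2], [-0.1, 0.7]]
check = matmul(geometric_sum(X, 200), I_minus_X)   # should be close to the identity
tail = mixed_sum(X, Y, Z, 200)                     # should be close to zero
print(check, tail)
```

The mixed sum in (ii) is also the top-right block of the t-th power of the block upper-triangular matrix with diagonal blocks X and Z and off-diagonal block Y, which is one way such sums arise when a graph has one component feeding into another.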



C Illustration of exponential convergence time of $P^t$ on ergodic digraphs




Figure 6: An example digraph with exponential convergence time. All edges are with unit weights. 

Given an unsigned ergodic digraph G = (V, E, A) with transition probability matrix P, it has a fixed stationary
distribution $\pi$, i.e., $\pi^T = \pi^T P$.

The convergence time (or mixing time) of a random walk Markov chain on G is the time until the Markov chain is
"close" to its stationary distribution $\pi$. To be precise, for an initial distribution $x_0$, let
$x_t^T = x_0^T P^t$ be the distribution at step t. The variation distance mixing time is defined as the smallest t
such that for any subset $W \subseteq V$,

$$|(x_t^T - \pi^T) e_W| \le \frac{1}{4},$$

where $e_W$ is the vector such that $e_W(i) = 1$ if $i \in W$, and $e_W(i) = 0$ if $i \in V \setminus W$.

The convergence time is said to be exponentially large if there exists $x_0$ such that the convergence time of the
random walk starting from $x_0$ is $2^{\Omega(n)}$, where n = |V|. Lemma 7 below illustrates that the convergence time
of random walks on ergodic digraphs could be exponentially large.

Lemma 7. There exist ergodic digraphs such that the convergence times of the random walks on these digraphs are
exponentially large.

Proof. We prove this by construction. Fig. 6 shows an example digraph G with |V| = 2m nodes. On the left-hand
side, there are $m \ge 3$ nodes $L_1, L_2, \ldots, L_m$ connected by $m - 1$ directed edges from $L_1$ to $L_m$, and
every node $L_i$ with i > 1 has a directed connection to the leftmost node $L_1$. The right-hand side nodes have
connections symmetric to those on the left-hand side. Moreover, nodes $L_m$ and $R_m$ each have one more connection,
to $R_1$ and $L_1$ respectively, which connect the two components together. It is clear that the graph is strongly
connected and aperiodic (there exist cycles of length 2 and 3), and thus ergodic.

Let $x_t(L_i)$ denote the probability that the random walk is at node $L_i$ at step t, and $x(L_i)$ be its stationary
distribution. Similarly define $x_t(R_i)$ and $x(R_i)$ for node $R_i$. The graph is symmetric, thus we have
$x(L_i) = x(R_i)$ for $1 \le i \le m$. Let $x(L_1) = x(R_1) = p/4$; we have $x(L_i) = x(R_i) = p/2^i$ for
i = 2, 3, ..., m. Then, by solving $\sum_{i=1}^{m} (x(L_i) + x(R_i)) = 1$, we obtain
$p = \frac{2^{m-1}}{3 \cdot 2^{m-2} - 1}$. It is easy to verify that the obtained x is indeed the stationary
distribution of the random walks on the digraph.

Then, we consider the initial distribution $x_0 = [1, 0, 0, \ldots, 0]$ and the subset $W = \{R_1, \ldots, R_m\}$
including all m nodes on the right-hand side. Let $x_t(W) = x_t^T e_W$ denote the total probability that the random
walk is in some node in W at step t. The only edge from the left half to the right half is the edge from $L_m$ to
$R_1$. Thus all additions to $x_{t+1}(W)$ from $x_t(W)$ come from this edge, namely
$x_{t+1}(W) - x_t(W) \le x_t(L_m)/2$. We now bound $x_t(L_m)$. For $t < m - 1$, we know that $x_t(L_m) = 0$. For
$t \ge m - 1$, we have

$$x_t(L_m) = x_{t-1}(L_{m-1})/2 = x_{t-2}(L_{m-2})/2^2 = \cdots = x_{t-m+2}(L_2)/2^{m-2} \le 1/2^{m-2}.$$

Hence, we have

$$x_t(W) = \sum_{i=1}^{t} (x_i(W) - x_{i-1}(W)) \le t \cdot \frac{1}{2} \cdot \frac{1}{2^{m-2}} = t/2^{m-1}.$$

Therefore, the smallest t that satisfies $|(x_t^T - \pi^T) e_W| = |x_t(W) - 1/2| \le 1/4$ is such that
$x_t(W) \ge 1/4$, which implies that $t/2^{m-1} \ge 1/4$ and $t \ge 2^{m-3}$. This completes the proof. □
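The construction in the proof can be simulated directly. The sketch below builds the example digraph for a given m, checks the closed-form stationary distribution, and iterates the walk from $L_1$ until at least a quarter of the probability mass has crossed to the right-hand side. The node indexing (index i-1 for $L_i$ and m+i-1 for $R_i$) and the function names are assumptions made for illustration.

```python
def example_digraph(m):
    """Out-neighbour lists; L_i is index i-1 and R_i is index m+i-1 (m >= 3)."""
    succ = [[] for _ in range(2 * m)]
    for base, other in ((0, m), (m, 0)):
        succ[base].append(base + 1)                 # first node -> second node
        for i in range(1, m - 1):
            succ[base + i] = [base + i + 1, base]   # forward edge, edge back to first node
        succ[base + m - 1] = [base, other]          # last node -> own first node, other side
    return succ

def step(x, succ):
    """One step of the random walk: each node splits its mass evenly over its out-edges."""
    nxt = [0.0] * len(x)
    for u, outs in enumerate(succ):
        share = x[u] / len(outs)
        for v in outs:
            nxt[v] += share
    return nxt

def stationary(m):
    """Closed form from the proof: x(L_1) = p/4 and x(L_i) = p/2^i for i >= 2."""
    p = 2 ** (m - 1) / (3 * 2 ** (m - 2) - 1)
    half = [p / 4] + [p / 2 ** i for i in range(2, m + 1)]
    return half + half

def time_to_quarter(m):
    """Smallest t with x_t(W) >= 1/4 when the walk starts at L_1."""
    succ = example_digraph(m)
    x = [1.0] + [0.0] * (2 * m - 1)
    t = 0
    while sum(x[m:]) < 0.25:
        x = step(x, succ)
        t += 1
    return t

for m in range(3, 11):
    print(m, time_to_quarter(m), 2 ** (m - 3))
```

The measured times should never fall below the bound $2^{m-3}$ from the proof and grow roughly geometrically in m, which is the exponential behavior the lemma asserts.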
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